2016-06-17 12 views
5

How to turn a pandas DataFrame row into a comma-separated string

I need to iterate over each row of a pandas DataFrame and turn it into a comma-separated string.

Example:

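import numpy as np
from pandas import DataFrame

# 10 rows x 5 columns of standard-normal random values
df3 = DataFrame(np.random.randn(10, 5),
                columns=['a', 'b', 'c', 'd', 'e'])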


          a         b         c         d         e
0 -0.158897 -0.749799  0.268921  0.070035  0.099600
1 -0.863654 -0.086814 -0.614562 -1.678850  0.980292
2 -0.098168  0.710652 -0.456274 -0.373153 -0.533463
3  1.001634 -0.736187 -0.812034  0.223062 -1.337972
4  0.173549 -0.576412 -1.016063 -0.217242  0.443794
5  0.273695  0.335562  0.778393 -0.668368  0.438880
6 -0.783824  1.439888  1.057639 -1.825481 -0.770953
7 -1.025004  0.155974  0.645023  0.993379 -0.812133
8  0.953448 -1.355628 -1.918317 -0.966472 -0.618744
9 -0.479297  0.295150 -0.294449  0.679416 -1.813078

For each row, I would like to get:

'-0.158897,-0.749799,0.268921,0.070035,0.099600' 
'-0.863654,-0.086814,-0.614562,-1.678850,0.980292'
... and so on 
+0

Do you want each row to be a separate string? – miradulo

+0

Yes, I need them to be separate –

Answers

8

You could use pandas.DataFrame.to_string with a few optional arguments set to False, then split on newline characters to get a list of strings. This feels a bit dirty, though.

x = df3.to_string(header=False, 
        index=False, 
        index_names=False).split('\n') 
vals = [','.join(ele.split()) for ele in x] 
print(vals) 

Output:

['1.221365,0.923175,-1.286149,-0.153414,-0.005078', '-0.231824,-1.131186,0.853728,0.160349,1.000170', '-0.147145,0.310587,-0.388535,0.957730,-0.185315', '-1.658463,-1.114204,0.760424,-1.504126,0.206909', '-0.734571,0.908569,-0.698583,-0.692417,-0.768087', '0.000029,0.204140,-0.483123,-1.064851,-0.835931', '-0.108869,0.426260,0.107286,-1.184402,0.434607', '-0.692160,-0.376433,0.567188,-0.171867,-0.822502', '-0.564726,-1.084698,-1.065283,-2.335092,-0.083357', '-1.429049,0.790535,-0.547701,-0.684346,2.048081'] 
1

You can convert the DataFrame to a numpy.array with values and then generate the strings:

b = '\n'.join(','.join('%0.3f' % x for x in y) for y in df.values)
print(b)
-1.245,-0.397,-0.374,0.698,-0.057 
-1.695,-1.593,0.992,-1.839,0.980 
1.154,-0.322,-0.583,1.022,1.800 
-1.705,0.148,-0.670,0.164,0.902 
1.573,-1.082,-0.243,-1.190,0.832 
2.535,-1.168,-0.258,-2.617,-0.766 
1.990,0.607,-0.115,0.114,0.175 
-0.652,0.245,-1.501,0.145,-0.079 
-1.977,3.543,-0.454,1.697,-0.648 
-0.756,0.561,-1.294,-0.747,-0.323 

If you need the strings in a list:

b = [','.join('%0.3f' % x for x in y) for y in df.values]
print(b)
['-1.139,0.257,-1.132,-0.987,1.194', '0.799,-1.061,-1.073,-0.176,0.528', '0.527,0.333,-0.185,-0.496,0.115', '-1.567,0.268,-1.457,2.121,-0.065', '-0.854,-2.344,0.747,0.208,-0.403', '1.850,0.084,1.890,-1.458,0.427', '1.649,0.134,-2.314,1.618,0.658', '2.178,-0.823,-0.499,0.083,-0.269', '-0.781,-0.212,1.623,-0.053,0.436', '0.842,-0.167,1.914,-0.087,0.717'] 
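
The question's expected output keeps six decimal places, while the snippet above rounds to three. Below is a minimal sketch of the same per-row join with six decimals. It assumes fixed sample values (the first two rows from the question) instead of random data, and it uses to_numpy(), the newer spelling of values:

```python
import numpy as np
import pandas as pd

# Fixed values so the result is reproducible (the original post used random data)
df3 = pd.DataFrame(
    np.array([[-0.158897, -0.749799, 0.268921, 0.070035, 0.099600],
              [-0.863654, -0.086814, -0.614562, -1.678850, 0.980292]]),
    columns=['a', 'b', 'c', 'd', 'e'])

# Format every float with six decimals, then join the cells of each row
rows = [','.join(f'{x:.6f}' for x in row) for row in df3.to_numpy()]
print(rows)
```

Changing the format spec (for example '.3f') is enough to change the precision.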
+2

What happens if I also have strings as values? –
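
One possible way to handle mixed string and numeric values is to cast every cell to str before joining, since '%0.3f' only works for numbers. This is a sketch under assumptions: the frame df_mixed and its contents are hypothetical and do not come from the thread.

```python
import pandas as pd

# Hypothetical frame mixing floats, strings and integers
df_mixed = pd.DataFrame({'a': [1.5, 2.25], 'b': ['x', 'y'], 'c': [3, 4]})

# Cast every cell to str, then join the cells of each row with commas
rows = df_mixed.astype(str).apply(','.join, axis=1).tolist()
print(rows)
```

Note that this keeps the default string form of each number, so any rounding has to be applied before the cast.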

0

Use to_csv:

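import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 5),
                  columns=['a', 'b', 'c', 'd', 'e'])
# Write CSV text without header or index, then split it into one string per row
df.to_csv(header=False, index=False).strip('\n').split('\n')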

['-1.60092768589,-0.746496859432,0.662527724304,-0.677984969682,1.70656657572', 
'-0.432306620615,-0.396499851892,0.564494290965,-1.01196068617,-0.630576490671', 
'-3.28916785414,0.627240166663,-0.359262938883,0.344156143177,-0.911269843378', 
'-0.272741450301,0.0594234886507,-2.72800253986,-0.821610087419,-0.0668212419497', 
'0.303490090149,-1.61344483051,0.117046351282,-1.46936429231,-0.66018613208', 
'-1.18157229705,-0.766519504863,0.386180129978,0.945274532852,-0.783459830884', 
'-1.27118723107,-1.12478330038,-0.625470220821,-0.9,0.0641830786961', 
'-1.02657336234,-1.01556460318,0.445282883845,0.589873985417,-0.833648685855', 
'0.742343897524,-1.69644542886,-1.03886940911,0.511317569685,1.87084848086', 
'-0.159125435887,1.02522202275,0.254459603867,-0.487187861352,2.31900012693'] 

Note: this needs to be improved if any of your cells contain \n, because splitting on newlines would then break a single row into several strings.
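
One way around the newline problem is to serialize each row on its own with the standard csv module, so no splitting on '\n' is needed. This is a rough sketch, not the answerer's method: the helper row_to_csv and the frame df_text are hypothetical names introduced for illustration.

```python
import csv
import io

import pandas as pd


def row_to_csv(values):
    """Serialize one row to a single CSV record, quoting cells when needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator='\n').writerow(values)
    # Drop only the record terminator added by the writer
    return buf.getvalue()[:-1]


# Hypothetical frame with a newline inside one cell
df_text = pd.DataFrame({'a': ['line1\nline2', 'plain'], 'b': [1, 2]})

# One string per row; embedded newlines stay inside a quoted cell
lines = [row_to_csv(row) for row in df_text.itertuples(index=False, name=None)]
print(lines)
```

Each row yields exactly one string, and a cell containing a newline or a comma is wrapped in double quotes by the csv writer.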