I have a CSV file structured like this:
| publish_date        | sentence_number | character_count | sentence          |
-------------------------------------------------------------------------------
| 1                   |                 |                 |                   |
| 02/01/2012 00:12:00 | -1              | 0               | Sentence 1 here.  |
| 02/01/2012 00:12:00 | 0               | 14              | Sentence 2 here.  |
| 02/01/2012 00:12:00 | 1               | 28              | "Sentence 3 here. |
| 02/01/2012 00:12:00 | 2               | 42              | Sentence 4 here." |
| 02/01/2012 00:12:00 | 3               | 56              | Sentence 5 here.  |
| end                 |                 |                 |                   |
| 2                   |                 |                 |                   |
| 02/01/2012 00:12:00 | -1              | 0               | Sentence 1 here.  |
| 02/01/2012 00:12:00 | 0               | 14              | Sentence 2 here.  |
| end                 |                 |                 |                   |
| end                 |                 |                 |                   |
What I'd like to do is combine each block of sentences into a paragraph and output the paragraphs individually. For the first block above, that would be:
['Sentence 1 here.', 'Sentence 2 here.', '"Sentence 3 here.', 'Sentence 4 here."', 'Sentence 5 here.']
Some sentences are quotes which continue into a new sentence, whilst others are entirely embedded within a sentence.
So far I've got this:
import csv

def read_file():
    # Column 3 holds the sentence text.
    included_cols = [3]
    with open('test.csv', newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            content = [row[i] for i in included_cols]
            print(content)
    # Only the content of the last row is returned.
    return content

read_file()
But this just outputs a list of sentences like so:
['Sentence 1 here.']
['Sentence 2 here.']
Any suggestions appreciated.
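For reference, here is a rough sketch of the grouping I'm after. It is not a tested solution. It assumes that a row whose first cell is a bare number opens a block, that a row whose first cell is "end" closes it, and that the file is comma-separated with the sentence in column 3. The function name group_paragraphs and the inline sample data are made up for illustration.

```python
import csv
import io

def group_paragraphs(rows, sentence_col=3):
    """Collect sentences between a numeric marker row and the next 'end' row."""
    paragraphs = []
    current = None  # None means we are not inside a block
    for row in rows:
        if not row:
            continue  # skip blank lines
        first = row[0].strip()
        if first.isdigit():
            current = []  # a marker row such as "1" opens a new block
        elif first == "end":
            if current is not None:
                paragraphs.append(current)
                current = None  # a stray extra "end" is ignored
        elif current is not None and len(row) > sentence_col:
            current.append(row[sentence_col].strip())
    return paragraphs

# Assumed sample data mirroring the table above as comma-separated text.
# Quote characters inside a field are doubled, as the csv module expects.
sample = '''1,,,
02/01/2012 00:12:00,-1,0,Sentence 1 here.
02/01/2012 00:12:00,0,14,Sentence 2 here.
02/01/2012 00:12:00,1,28,"""Sentence 3 here."
02/01/2012 00:12:00,2,42,"Sentence 4 here."""
02/01/2012 00:12:00,3,56,Sentence 5 here.
end,,,
2,,,
02/01/2012 00:12:00,-1,0,Sentence 1 here.
02/01/2012 00:12:00,0,14,Sentence 2 here.
end,,,
end,,,
'''

paragraphs = group_paragraphs(csv.reader(io.StringIO(sample)))
for sentences in paragraphs:
    # Join each block's sentences into one paragraph string.
    print(" ".join(sentences))
```

With a real file, the same function could be fed csv.reader(open('test.csv', newline='')) instead of the in-memory sample.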