7.1. CSV About

  • CSV - Comma/Character Separated Values

  • No formal CSV standard (RFC 4180 is informational only), just good practice

  • Flat file (2D) without relations

  • Relations have to be flattened (serialization, additional columns, etc.)

  • Typically the first line (header) holds the column names

  • Rarely, the first line describes the structure instead (nrows, ncols)

  • Internationalization: encoding

  • Localization: decimal separator, thousands separator, date format

  • Parameters: delimiter, quotechar, quoting, lineterminator, dialect

Example CSV file:

SepalLength, SepalWidth, PetalLength, PetalWidth, Species
5.8, 2.7, 5.1, 1.9, virginica
5.1, 3.5, 1.4, 0.2, setosa
5.7, 2.8, 4.1, 1.3, versicolor
7.3, 2.9, 6.3, 1.8, virginica
5.6, 2.5, 3.9, 1.1, versicolor
5.4, 3.9, 1.3, 0.4, setosa
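
As a minimal sketch, the example file above can be read with the standard csv module. The io.StringIO wrapper and the shortened DATA string are assumptions made so the snippet runs without a file on disk. The option skipinitialspace=True is used because this example puts a space after each comma.

```python
import csv
import io

# Shortened copy of the example above; io.StringIO stands in for an open file
DATA = """SepalLength, SepalWidth, PetalLength, PetalWidth, Species
5.8, 2.7, 5.1, 1.9, virginica
5.1, 3.5, 1.4, 0.2, setosa
"""

# skipinitialspace=True drops the space that follows each delimiter
reader = csv.DictReader(io.StringIO(DATA), skipinitialspace=True)
rows = list(reader)

print(reader.fieldnames)
# ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']

print(rows[0]['Species'])
# virginica
```

All values come back as str. Converting them to float is a separate step.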

7.1.2. Variants

CSV file with numeric values:

5.8, 2.7, 5.1, 1.9, 2
5.1, 3.5, 1.4, 0.2, 0
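5.7, 2.8, 4.1, 1.3, 1

CSV file with numeric values. First line describes the structure (nrows, ncols) and lists the class labels:

3, 4, setosa, versicolor, virginica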
5.8, 2.7, 5.1, 1.9, 2
5.1, 3.5, 1.4, 0.2, 0
5.7, 2.8, 4.1, 1.3, 1

CSV file with text values. First line is a header:

Firstname, Lastname, Born
Melissa, Lewis, 1995-07-15
Rick, Martinez, 1996-01-21
Alex, Vogel, 1994-11-15
Chris, Beck, 1999-08-02
Beth, Johanssen, 2006-05-09
Mark, Watney, 1994-10-12

7.1.3. Delimiter

delimiter=',':

SepalLength, SepalWidth, PetalLength, PetalWidth, Species
5.8, 2.7, 5.1, 1.9, virginica
5.1, 3.5, 1.4, 0.2, setosa
5.7, 2.8, 4.1, 1.3, versicolor

delimiter=';':

SepalLength; SepalWidth; PetalLength; PetalWidth; Species
5.8; 2.7; 5.1; 1.9; virginica
5.1; 3.5; 1.4; 0.2; setosa
5.7; 2.8; 4.1; 1.3; versicolor

delimiter=':':

root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
nobody:x:99:99:Nobody:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
watney:x:1000:1000:Mark Watney:/home/watney:/bin/bash
lewis:x:1001:1001:Melissa Lewis:/home/lewis:/bin/bash
martinez:x:1002:1002:Rick Martinez:/home/martinez:/bin/bash

delimiter='|':

| Firstname | Lastname | Role      |
|-----------|----------|-----------|
| Mark      | Watney   | Botanist  |
| Melissa   | Lewis    | Commander |
| Rick      | Martinez | Pilot     |

delimiter='\t':

SepalLength SepalWidth      PetalLength     PetalWidth      Species
5.8 2.7     5.1     1.9     virginica
5.1 3.5     1.4     0.2     setosa
5.7 2.8     4.1     1.3     versicolor
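
A short sketch of how the delimiter parameter is passed to csv.reader(). The two sample lines are taken from the colon-separated listing above. The users name and the choice of columns (login, UID, shell) are assumptions made for illustration.

```python
import csv

LINES = [
    'root:x:0:0:root:/root:/bin/bash',
    'watney:x:1000:1000:Mark Watney:/home/watney:/bin/bash',
]

# csv.reader accepts any iterable of strings, not only file objects
reader = csv.reader(LINES, delimiter=':')

# Keep login (column 0), UID (column 2) and shell (column 6)
users = [(row[0], int(row[2]), row[6]) for row in reader]

print(users)
# [('root', 0, '/bin/bash'), ('watney', 1000, '/bin/bash')]
```

The same call with delimiter=';', delimiter='|' or delimiter='\t' handles the other variants shown above.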

7.1.4. Quotechar

  • " - quote char (best)

  • ' - apostrophe

quotechar='"':

"SepalLength", "SepalWidth", "PetalLength", "PetalWidth", "Species"
"5.8", "2.7", "5.1", "1.9", "virginica"
"5.1", "3.5", "1.4", "0.2", "setosa"
"5.7", "2.8", "4.1", "1.3", "versicolor"

quotechar="'":

'SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species'
'5.8', '2.7', '5.1', '1.9', 'virginica'
'5.1', '3.5', '1.4', '0.2', 'setosa'
'5.7', '2.8', '4.1', '1.3', 'versicolor'

quotechar='|':

|SepalLength|, |SepalWidth|, |PetalLength|, |PetalWidth|, |Species|
|5.8|, |2.7|, |5.1|, |1.9|, |virginica|
|5.1|, |3.5|, |1.4|, |0.2|, |setosa|
|5.7|, |2.8|, |4.1|, |1.3|, |versicolor|

quotechar='/':

/SepalLength/, /SepalWidth/, /PetalLength/, /PetalWidth/, /Species/
/5.8/, /2.7/, /5.1/, /1.9/, /virginica/
/5.1/, /3.5/, /1.4/, /0.2/, /setosa/
/5.7/, /2.8/, /4.1/, /1.3/, /versicolor/
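
The quote character matters as soon as a field contains the delimiter. The sketch below parses the same line twice, once with the default quotechar and once with an apostrophe. The sample line is an assumption made for illustration.

```python
import csv

# One record whose first field contains the delimiter
LINE = ["'Watney, Mark','Botanist'"]

# Default quotechar is '"', so apostrophes are treated as ordinary characters
plain = next(csv.reader(LINE))

# With quotechar="'" the comma inside the quotes no longer splits the field
quoted = next(csv.reader(LINE, quotechar="'"))

print(plain)
# ["'Watney", " Mark'", "'Botanist'"]

print(quoted)
# ['Watney, Mark', 'Botanist']
```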

7.1.5. Quoting

  • csv.QUOTE_ALL (safest)

  • csv.QUOTE_MINIMAL

  • csv.QUOTE_NONE

  • csv.QUOTE_NONNUMERIC

quoting=csv.QUOTE_ALL:

"SepalLength", "SepalWidth", "PetalLength", "PetalWidth", "Species"
"5.8", "2.7", "5.1", "1.9", "virginica"
"5.1", "3.5", "1.4", "0.2", "setosa"
"5.7", "2.8", "4.1", "1.3", "versicolor"

quoting=csv.QUOTE_MINIMAL:

SepalLength, SepalWidth, PetalLength, PetalWidth, Species
5.8, 2.7, 5.1, 1.9, virginica
5.1, 3.5, 1.4, 0.2, setosa
5.7, 2.8, 4.1, 1.3, versicolor

quoting=csv.QUOTE_NONE:

SepalLength, SepalWidth, PetalLength, PetalWidth, Species
5.8, 2.7, 5.1, 1.9, virginica
5.1, 3.5, 1.4, 0.2, setosa
5.7, 2.8, 4.1, 1.3, versicolor

quoting=csv.QUOTE_NONNUMERIC:

"SepalLength", "SepalWidth", "PetalLength", "PetalWidth", "Species"
5.8, 2.7, 5.1, 1.9, "virginica"
5.1, 3.5, 1.4, 0.2, "setosa"
5.7, 2.8, 4.1, 1.3, "versicolor"
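
The four quoting modes can be compared by writing one row with each of them. This is a sketch: ROW and the helper render() are hypothetical names, and escapechar is added for csv.QUOTE_NONE because without it the writer raises csv.Error when a field contains the delimiter.

```python
import csv
import io

ROW = ['Mark Watney', 21, 5.8, 'Some, comment']


def render(quoting, **kwargs):
    """Write ROW with the given quoting mode and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=quoting, lineterminator='\n', **kwargs)
    writer.writerow(ROW)
    return buffer.getvalue()


print(render(csv.QUOTE_ALL), end='')
# "Mark Watney","21","5.8","Some, comment"

print(render(csv.QUOTE_MINIMAL), end='')
# Mark Watney,21,5.8,"Some, comment"

print(render(csv.QUOTE_NONNUMERIC), end='')
# "Mark Watney",21,5.8,"Some, comment"

print(render(csv.QUOTE_NONE, escapechar='\\'), end='')
# Mark Watney,21,5.8,Some\, comment
```

csv.QUOTE_MINIMAL quotes only the fields that need it, which is why the plain listing above has no quotes at all.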

7.1.6. Lineterminator

  • \r\n - New line on Windows

  • \n - New line on *nix

  • *nix operating systems: Linux, macOS, BSD and other POSIX compliant OSes (excluding Windows)
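
A small sketch of the lineterminator parameter of csv.writer(). The helper to_csv() and the sample rows are hypothetical. repr() is used so that the line endings are visible.

```python
import csv
import io

ROWS = [['Mark', 'Watney'], ['Melissa', 'Lewis']]


def to_csv(rows, lineterminator):
    """Serialize rows to CSV text using the given line terminator."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator=lineterminator).writerows(rows)
    return buffer.getvalue()


windows = to_csv(ROWS, '\r\n')
unix = to_csv(ROWS, '\n')

print(repr(windows))
# 'Mark,Watney\r\nMelissa,Lewis\r\n'

print(repr(unix))
# 'Mark,Watney\nMelissa,Lewis\n'

# When writing to a real file, open it with newline='' so that Python
# does not translate the line endings produced by the csv module.
```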

7.1.7. Decimal Separator

  • 0.1 - Decimal point

  • 0,1 - Decimal comma

[Figure: l10n-decimal-separator.png]

Decimal point:

SepalLength; SepalWidth; PetalLength; PetalWidth; Species
5.8; 2.7; 5.1; 1.9; virginica
5.1; 3.5; 1.4; 0.2; setosa
5.7; 2.8; 4.1; 1.3; versicolor

Decimal comma:

SepalLength; SepalWidth; PetalLength; PetalWidth; Species
5,8; 2,7; 5,1; 1,9; virginica
5,1; 3,5; 1,4; 0,2; setosa
5,7; 2,8; 4,1; 1,3; versicolor
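
Python's float() understands only the decimal point, so decimal-comma values need a conversion step. A minimal sketch, assuming a semicolon-delimited file with three columns; the helper to_float() is a hypothetical name.

```python
import csv

LINES = [
    'SepalLength;SepalWidth;Species',
    '5,8;2,7;virginica',
    '5,1;3,5;setosa',
]


def to_float(text):
    """Convert a decimal-comma string such as '5,8' to float."""
    return float(text.replace(',', '.'))


# The delimiter must differ from the decimal separator, hence ';'
header, *rows = csv.reader(LINES, delimiter=';')
result = [(to_float(length), to_float(width), species)
          for length, width, species in rows]

print(result)
# [(5.8, 2.7, 'virginica'), (5.1, 3.5, 'setosa')]
```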

7.1.8. Thousands Separator

  • 1000000 - None

  • 1'000'000 - Apostrophe

  • 1 000 000 - Space, the internationally recommended thousands separator

  • 1.000.000 - Period, used in many non-English speaking countries

  • 1,000,000 - Comma, used in most English-speaking countries
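
The variants above can be normalized by removing the separator before calling int(). This is a sketch under the assumption that the separator used in the file is known in advance; parse_int() is a hypothetical helper.

```python
def parse_int(text, thousands):
    """Remove the given thousands separator and convert to int."""
    return int(text.replace(thousands, ''))


values = [
    parse_int("1'000'000", "'"),
    parse_int('1 000 000', ' '),
    parse_int('1.000.000', '.'),
    parse_int('1,000,000', ','),
]

print(values)
# [1000000, 1000000, 1000000, 1000000]

# Formatting in the other direction with the format mini-language
print(f'{1000000:,}')
# 1,000,000

print(f'{1000000:,}'.replace(',', ' '))
# 1 000 000
```

Guessing the separator from the data is unreliable: 1.000 may mean one thousand or one, depending on the locale.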

7.1.9. Date and Time

>>> date = '1961-04-12'
>>> date = '12.4.1961'
>>> date = '12.04.1961'
>>> date = '12-04-1961'
>>> date = '12/04/1961'
>>> date = '4/12/61'
>>> date = '4.12.1961'
>>> date = 'Apr 12, 1961'
>>> date = 'Apr 12th, 1961'
>>> time = '12:00:00'
>>> time = '12:00'
>>> time = '12:00 pm'
>>> duration = '04:30:00'
>>> duration = '4h 30m'
>>> duration = '4 hours 30 minutes'
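
Each of these notations needs its own format string. The sketch below parses a few of them with datetime.strptime(). The mapping of strings to formats is an assumption, since a value such as '12/04/1961' does not say on its own whether the day or the month comes first.

```python
from datetime import date, datetime, timedelta

# Assumed formats: day-first for the dotted and slashed variants
FORMATS = {
    '1961-04-12': '%Y-%m-%d',
    '12.4.1961': '%d.%m.%Y',
    '12/04/1961': '%d/%m/%Y',
}

parsed = {text: datetime.strptime(text, fmt).date()
          for text, fmt in FORMATS.items()}

print(set(parsed.values()))
# {datetime.date(1961, 4, 12)}

# Two-digit years are ambiguous: %y maps 00-68 to 2000-2068
short = datetime.strptime('4/12/61', '%m/%d/%y').date()
print(short)
# 2061-04-12

# A duration is not a point in time, so it is built by hand
hours, minutes, seconds = map(int, '04:30:00'.split(':'))
duration = timedelta(hours=hours, minutes=minutes, seconds=seconds)
print(duration)
# 4:30:00
```

The ISO 8601 form ('1961-04-12') is the only one in this list that needs no guessing, which makes it a good default for CSV files.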

7.1.10. Encoding

  • utf-8 - international standard (should always be used!)

  • iso-8859-1 - ISO standard for Western Europe and USA

  • iso-8859-2 - ISO standard for Central Europe (including Poland)

  • cp1250 or windows-1250 - Central European encoding on Windows

  • cp1251 or windows-1251 - Cyrillic encoding on Windows (Eastern Europe)

  • cp1252 or windows-1252 - Western European encoding on Windows

  • ASCII - ASCII characters only

with open(FILE, encoding='utf-8') as file:
    ...
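
To see why the encoding has to be stated explicitly, the sketch below encodes one Polish sentence in several ways. It works on bytes in memory instead of a file, which is an assumption made to keep the example self-contained.

```python
TEXT = 'zażółć gęślą jaźń'

utf8 = TEXT.encode('utf-8')
cp1250 = TEXT.encode('cp1250')

# The same text gives different bytes in different encodings
print(len(TEXT), len(utf8), len(cp1250))
# 17 26 17

# Decoding with the encoding used for writing restores the text
print(utf8.decode('utf-8') == TEXT)
# True

# Decoding with a different encoding silently produces wrong characters
wrong = cp1250.decode('cp1252')
print(wrong == TEXT)
# False

# ASCII cannot represent the Polish letters at all
print(TEXT.encode('ascii', errors='replace'))
# b'za???? g??l? ja??'
```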

7.1.11. Dialects

import csv

csv.list_dialects()
# ['excel', 'excel-tab', 'unix']
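
The built-in dialects are bundles of the parameters described earlier. The sketch below inspects them with csv.get_dialect() and registers a custom one. The dialect name 'semicolon' is hypothetical.

```python
import csv
import io

excel = csv.get_dialect('excel')
unix = csv.get_dialect('unix')

print(repr(excel.delimiter), repr(excel.lineterminator),
      excel.quoting == csv.QUOTE_MINIMAL)
# ',' '\r\n' True

print(repr(unix.delimiter), repr(unix.lineterminator),
      unix.quoting == csv.QUOTE_ALL)
# ',' '\n' True

# Register a custom dialect under a hypothetical name
csv.register_dialect('semicolon', delimiter=';',
                     quoting=csv.QUOTE_ALL, lineterminator='\n')

buffer = io.StringIO()
csv.writer(buffer, dialect='semicolon').writerow(['Mark', 'Watney'])
print(buffer.getvalue(), end='')
# "Mark";"Watney"
```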
  • Microsoft Excel 2016-2020:

    • quoting=csv.QUOTE_MINIMAL

    • quotechar='"'

    • delimiter=',' or delimiter=';' depending on Windows locale decimal separator

    • lineterminator='\r\n'

    • encoding='...' - depends on the Windows locale, typically windows-*

  • Microsoft Excel macOS:

    • quoting=csv.QUOTE_MINIMAL

    • quotechar='"'

    • delimiter=','

    • lineterminator='\r\n'

    • encoding='utf-8'

  • Microsoft export options:

[Figure: csv-standard-dialects.png]

$ file utf8.csv
utf8.csv: CSV text

$ cat utf8.csv
Firstname,Lastname,Age,Comment
Mark,Watney,21,zażółć gęślą jaźń
Melissa,Lewis,21.5,"Some, comment"
,,"21,5",Some; Comment

$ file standard.csv
standard.csv: CSV text

$ cat standard.csv
Firstname,Lastname,Age,Comment
Mark,Watney,21,za_?__ g__l_ ja__
Melissa,Lewis,21.5,"Some, comment"
,,"21,5",Some; Comment

$ file dos.csv
dos.csv: CSV text

$ cat dos.csv
Firstname,Lastname,Age,Comment
Mark,Watney,21,za_?__ g__l_ ja__
Melissa,Lewis,21.5,"Some, comment"
,,"21,5",Some; Comment

$ file macintosh.csv
macintosh.csv: Non-ISO extended-ASCII text, with CR line terminators

$ cat macintosh.csv
,,"21,5",Some; Comment

7.1.12. Good Practices

Always specify explicitly:

  • delimiter=',' in csv.DictReader()

  • quotechar='"' in csv.DictReader()

  • quoting=csv.QUOTE_ALL in csv.DictReader()

  • lineterminator='\n' in csv.DictWriter() (readers ignore this parameter and accept both \r and \n as end of line)

  • encoding='utf-8' in open() (especially when working with Microsoft Excel)
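
Put together, the recommendations above might look like the following round trip. This is a sketch: the file name astronauts.csv, the sample DATA and the OPTIONS dict are assumptions, and a temporary directory is used so that nothing is left on disk.

```python
import csv
import tempfile
from pathlib import Path

DATA = [
    {'firstname': 'Mark', 'lastname': 'Watney', 'comment': 'zażółć, gęślą'},
    {'firstname': 'Melissa', 'lastname': 'Lewis', 'comment': 'Some "quoted" text'},
]

# Every parameter is stated explicitly instead of relying on defaults
OPTIONS = dict(delimiter=',', quotechar='"',
               quoting=csv.QUOTE_ALL, lineterminator='\n')

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'astronauts.csv'

    # newline='' leaves line endings to the csv module
    with open(path, mode='w', encoding='utf-8', newline='') as file:
        writer = csv.DictWriter(
            file, fieldnames=['firstname', 'lastname', 'comment'], **OPTIONS)
        writer.writeheader()
        writer.writerows(DATA)

    with open(path, encoding='utf-8', newline='') as file:
        result = list(csv.DictReader(file, **OPTIONS))

print(result == DATA)
# True
```

The comma and the quotation marks inside the comment field survive the round trip because every field is quoted and embedded quotes are doubled by the writer.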

7.1.13. Assignments

Code 7.9. Solution
"""
* Assignment: CSV Format ReadString
* Complexity: easy
* Lines of code: 4 lines
* Time: 5 min

English:
    1. Convert `DATA` to `result: list[tuple[str]]`
    2. Do not convert numeric values to `float`, leave them as `str`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` to `result: list[tuple[str]]`
    2. Nie konwertuj wartości numerycznych do `float`, zostaw jako `str`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.splitlines()`
    * `str.strip()`
    * `str.split()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
     ('5.8', '2.7', '5.1', '1.9', 'virginica'),
     ('5.1', '3.5', '1.4', '0.2', 'setosa'),
     ('5.7', '2.8', '4.1', '1.3', 'versicolor')]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

# data from file (note the list[tuple] format!)
# type: list[tuple]
result = ...

Code 7.10. Solution
"""
* Assignment: CSV Format ReadSwitch
* Complexity: easy
* Lines of code: 6 lines
* Time: 5 min

English:
    1. Convert `DATA` to `result: list[tuple[str]]`
    2. Substitute last element (class label) with value from `LABEL_ENCODER`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` to `result: list[tuple[str]]`
    2. Podmień ostatni element (etykietę klasową) z wartością z `LABEL_ENCODER`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.splitlines()`
    * `str.strip()`
    * `str.split()`
    * `dict.get()`
    * `list() + list()`
    * `list.append()`
    * `tuple()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
     ('5.8', '2.7', '5.1', '1.9', 'virginica'),
     ('5.1', '3.5', '1.4', '0.2', 'setosa'),
     ('5.7', '2.8', '4.1', '1.3', 'versicolor')]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,0
5.1,3.5,1.4,0.2,1
5.7,2.8,4.1,1.3,2"""

LABEL_ENCODER = {
    '0': 'virginica',
    '1': 'setosa',
    '2': 'versicolor'}

# data from file (note the list[tuple] format!)
# type: list[tuple]
result = ...

Code 7.11. Solution
"""
* Assignment: CSV Format ReadLabelEncoder
* Complexity: medium
* Lines of code: 10 lines
* Time: 13 min

English:
    1. Convert `DATA` to `result: list[tuple[str]]`
    2. Generate `LABEL_ENCODER: dict[int,str]` from `header: list[str]`
    3. Substitute last element (class label) with value from `LABEL_ENCODER`
    4. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` to `result: list[tuple[str]]`
    2. Wygeneruj `LABEL_ENCODER: dict[int,str]` z `header: list[str]`
    3. Podmień ostatni element (etykietę klasową) z wartością z `LABEL_ENCODER`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `dict(enumerate())`
    * `str.strip()`
    * `str.split()`
    * `dict.get()`
    * `int()`
    * `list() + list()`
    * `list.append()`
    * `tuple()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [('5.8', '2.7', '5.1', '1.9', 'virginica'),
     ('5.1', '3.5', '1.4', '0.2', 'setosa'),
     ('5.7', '2.8', '4.1', '1.3', 'versicolor')]
"""

DATA = """3,4,setosa,virginica,versicolor
5.8,2.7,5.1,1.9,1
5.1,3.5,1.4,0.2,0
5.7,2.8,4.1,1.3,2"""

# values from file (note the list[tuple] format!)
# type: list[tuple]
result = ...

Code 7.12. Solution
"""
* Assignment: CSV Format ReadTypeCast
* Complexity: easy
* Lines of code: 9 lines
* Time: 8 min

English:
    1. Convert `DATA` to `result: list[tuple]`
    2. Convert numeric values to `float`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do `result: list[tuple]`
    2. Przekonwertuj wartości numeryczne do `float`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.strip()`
    * `str.split()`
    * `map()`
    * `list() + list()`
    * `list.append()`
    * `tuple()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
     (5.8, 2.7, 5.1, 1.9, 'virginica'),
     (5.1, 3.5, 1.4, 0.2, 'setosa'),
     (5.7, 2.8, 4.1, 1.3, 'versicolor')]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

# values from file (note the list[tuple] format!)
# type: list[tuple]
result = ...

Code 7.13. Solution
"""
* Assignment: CSV Format ReadFixedHeader
* Complexity: easy
* Lines of code: 5 lines
* Time: 5 min

English:
    1. Convert `DATA` to `result: list[dict]`
    2. Use `HEADER` as dict keys
    3. Do not convert numeric values to `float`, leave them as `str`
    4. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` to `result: list[dict]`
    2. Użyj `HEADER` jako kluczy dictów
    3. Nie konwertuj wartości numeryczne do `float`, pozostaw je jako `str`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.splitlines()`
    * `str.strip()`
    * `str.split()`
    * `dict(zip())`
    * `list.append()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is dict for x in result), \
    'All rows in `result` should be dict'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [{'sepal_length': '5.8', 'sepal_width': '2.7', 'petal_length': '5.1',
      'petal_width': '1.9', 'species': 'virginica'},
     {'sepal_length': '5.1', 'sepal_width': '3.5', 'petal_length': '1.4',
      'petal_width': '0.2', 'species': 'setosa'},
     {'sepal_length': '5.7', 'sepal_width': '2.8', 'petal_length': '4.1',
      'petal_width': '1.3', 'species': 'versicolor'}]
"""

DATA = """5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

HEADER = [
    'sepal_length',
    'sepal_width',
    'petal_length',
    'petal_width',
    'species',
]

# Replace keys with `HEADER`
# type: list[dict[str,str]]
result = ...

Code 7.14. Solution
"""
* Assignment: CSV Format ReadGenerateHeader
* Complexity: easy
* Lines of code: 7 lines
* Time: 8 min

English:
    1. Generate `header: list[str]` from first line `DATA`
    2. Convert `DATA` to `result: list[dict]`
    3. Use `header` as keys
    4. Do not convert numeric values to `float`, leave them as `str`
    5. Run doctests - all must succeed

Polish:
    1. Wygeneruj `header: list[str]` z pierwszej linii `DATA`
    2. Przekonwertuj `DATA` to `result: list[dict]`
    3. Użyj nagłówka jako kluczy
    4. Nie konwertuj wartości numeryczne do `float`, pozostaw je jako `str`
    5. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.strip()`
    * `str.split()`
    * `map()`
    * `list() + list()`
    * `list.append()`
    * `tuple()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is dict for x in result), \
    'All rows in `result` should be dict'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [{'sepal_length': '5.8', 'sepal_width': '2.7', 'petal_length': '5.1',
      'petal_width': '1.9', 'species': 'virginica'},
     {'sepal_length': '5.1', 'sepal_width': '3.5', 'petal_length': '1.4',
      'petal_width': '0.2', 'species': 'setosa'},
     {'sepal_length': '5.7', 'sepal_width': '2.8', 'petal_length': '4.1',
      'petal_width': '1.3', 'species': 'versicolor'}]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

# rows as dicts, with keys taken from `header` (first line of `DATA`)
# type: list[dict]
result = ...

Code 7.15. Solution
"""
* Assignment: CSV Format WriteListDict
* Complexity: easy
* Lines of code: 4 lines
* Time: 5 min

English:
    1. Convert `DATA` to CSV as `result: str`:
       a. do not add header
       b. firstname - first field
       c. lastname - second field
    2. Non-functional requirements:
       a. Do not use `import` or any module
       b. Quotechar: None
       c. Quoting: None
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do CSV jako `result: str`:
       a. nie dodawaj nagłówka
       b. imię - pierwsze pole
       c. nazwisko - drugie pole
    2. Wymagania niefunkcjonalne:
       a. Nie używaj `import` ani żadnych modułów
       b. Quotechar: None
       c. Quoting: None
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is str, \
    'Variable `result` has invalid type, should be str'

    >>> print(result)   # doctest: +NORMALIZE_WHITESPACE
    Pan,Twardowski
    Rick,Martinez
    Mark,Watney
    Ivan,Ivanovic
    Melissa,Lewis
    <BLANKLINE>
"""

DATA = [
    {'firstname': 'Pan', 'lastname': 'Twardowski'},
    {'firstname': 'Rick', 'lastname': 'Martinez'},
    {'firstname': 'Mark', 'lastname': 'Watney'},
    {'firstname': 'Ivan', 'lastname': 'Ivanovic'},
    {'firstname': 'Melissa', 'lastname': 'Lewis'},
]

# multiline string with `firstname,lastname` pairs
# type: str
result = ...

Code 7.16. Solution
"""
* Assignment: CSV Format WriteFixed
* Complexity: medium
* Lines of code: 5 lines
* Time: 5 min

English:
    1. Convert `DATA` to CSV as `result: str`:
       a. add header
       b. firstname - first field
       c. lastname - second field
    2. Non-functional requirements:
       a. Do not use `import` or any module
       b. Quotechar: `"`
       c. Quoting: always
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do CSV jako `result: str`:
       a. dodaj nagłówek
       b. imię - pierwsze pole
       c. nazwisko - drugie pole
    2. Wymagania niefunkcjonalne:
       a. Nie używaj `import` ani żadnych modułów
       b. Quotechar: `"`
       c. Quoting: zawsze
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is str, \
    'Variable `result` has invalid type, should be str'

    >>> print(result)   # doctest: +NORMALIZE_WHITESPACE
    "firstname","lastname"
    "Pan","Twardowski"
    "Rick","Martinez"
    "Mark","Watney"
    "Ivan","Ivanovic"
    "Melissa","Lewis"
    <BLANKLINE>
"""

DATA = [
    {'firstname': 'Pan', 'lastname': 'Twardowski'},
    {'firstname': 'Rick', 'lastname': 'Martinez'},
    {'firstname': 'Mark', 'lastname': 'Watney'},
    {'firstname': 'Ivan', 'lastname': 'Ivanovic'},
    {'firstname': 'Melissa', 'lastname': 'Lewis'},
]

# multiline string with header and `"firstname","lastname"` pairs
# type: str
result = ...

Code 7.17. Solution
"""
* Assignment: CSV Format WriteSchemaless
* Complexity: medium
* Lines of code: 13 lines
* Time: 13 min

English:
    1. Define `header: str` with sorted list of unique keys from `DATA`
    2. `header` must be automatically generated from `DATA`
    3. Iterate over `DATA` and extract values for each header column
    4. Define `result: str` with header and matching values
    5. Non-functional requirements:
       a. Do not use `import` or any module
       b. Quotechar: `"`
       c. Quoting: always
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    6. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `header: str` z posortowaną listą unikalnych kluczy z `DATA`
    2. `header` musi być generowany automatycznie z `DATA`
    3. Iteruj po `DATA` i wyciągnij wartości dla każdej kolumny z nagłówka
    4. Zdefiniuj `result: str` z nagłówkiem i pasującymi wartościami
    5. Wymagania niefunkcjonalne:
       a. Nie używaj `import` ani żadnych modułów
       b. Quotechar: `"`
       c. Quoting: zawsze
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    6. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is str, \
    'Variable `result` has invalid type, should be str'

    >>> print(result)
    "Petal length","Petal width","Sepal length","Sepal width","Species"
    "","","5.1","3.5","setosa"
    "4.1","1.3","","","versicolor"
    "","1.8","6.3","","virginica"
    "","0.2","5.0","","setosa"
    "4.1","","","2.8","versicolor"
    "","1.8","","2.9","virginica"
    <BLANKLINE>
"""

DATA = [
    {'Sepal length': 5.1, 'Sepal width': 3.5, 'Species': 'setosa'},
    {'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
    {'Sepal length': 6.3, 'Petal width': 1.8, 'Species': 'virginica'},
    {'Sepal length': 5.0, 'Petal width': 0.2, 'Species': 'setosa'},
    {'Sepal width': 2.8, 'Petal length': 4.1, 'Species': 'versicolor'},
    {'Sepal width': 2.9, 'Petal width': 1.8, 'Species': 'virginica'},
]

# header has unique keys from DATA, row values match header columns
# type: str
result = ...

Code 7.18. Solution
"""
* Assignment: CSV Format WriteListTuple
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min

English:
    1. Define `result: str` with `DATA` converted to CSV format
    2. Non-functional requirements:
       a. Do not use `import` or any module
       b. Quotechar: None
       c. Quoting: never
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: str` z `DATA` przekonwertowaną do formatu CSV
    2. Wymagania niefunkcjonalne:
       a. Nie używaj `import` ani żadnych modułów
       b. Quotechar: None
       c. Quoting: nigdy
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is str, \
    'Variable `result` has invalid type, should be str'

    >>> print(result)
    SepalLength,SepalWidth,PetalLength,PetalWidth,Species
    5.8,2.7,5.1,1.9,virginica
    5.1,3.5,1.4,0.2,setosa
    5.7,2.8,4.1,1.3,versicolor
    6.3,2.9,5.6,1.8,virginica
    6.4,3.2,4.5,1.5,versicolor
    4.7,3.2,1.3,0.2,setosa
    7.0,3.2,4.7,1.4,versicolor
    7.6,3.0,6.6,2.1,virginica
    4.9,3.0,1.4,0.2,setosa
    <BLANKLINE>
"""

DATA = [
    ('SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (7.6, 3.0, 6.6, 2.1, 'virginica'),
    (4.9, 3.0, 1.4, 0.2, 'setosa')]

# DATA converted to CSV format
# type: str
result = ...

Code 7.19. Solution
"""
* Assignment: CSV Format WriteListDict
* Complexity: medium
* Lines of code: 7 lines
* Time: 8 min

English:
    1. Define `result: str` with `DATA` converted to CSV format
    2. Non-functional requirements:
       a. Do not use `import` or any module
       b. Quotechar: None
       c. Quoting: never
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: str` z `DATA` przekonwertowaną do formatu CSV
    2. Wymagania niefunkcjonalne:
       a. Nie używaj `import` ani żadnych modułów
       b. Quotechar: None
       c. Quoting: nigdy
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `dict.keys()`
    * `dict.values()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is str, \
    'Variable `result` has invalid type, should be str'

    >>> print(result)
    sepal_length,sepal_width,petal_length,petal_width,species
    5.1,3.5,1.4,0.2,setosa
    5.8,2.7,5.1,1.9,virginica
    5.1,3.5,1.4,0.2,setosa
    5.7,2.8,4.1,1.3,versicolor
    6.3,2.9,5.6,1.8,virginica
    6.4,3.2,4.5,1.5,versicolor
    <BLANKLINE>
"""

DATA = [{'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4,
         'petal_width': 0.2, 'species': 'setosa'},
        {'sepal_length': 5.8, 'sepal_width': 2.7, 'petal_length': 5.1,
         'petal_width': 1.9, 'species': 'virginica'},
        {'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4,
         'petal_width': 0.2, 'species': 'setosa'},
        {'sepal_length': 5.7, 'sepal_width': 2.8, 'petal_length': 4.1,
         'petal_width': 1.3, 'species': 'versicolor'},
        {'sepal_length': 6.3, 'sepal_width': 2.9, 'petal_length': 5.6,
         'petal_width': 1.8, 'species': 'virginica'},
        {'sepal_length': 6.4, 'sepal_width': 3.2, 'petal_length': 4.5,
         'petal_width': 1.5, 'species': 'versicolor'}]

# DATA converted to CSV format
# type: str
result = ...

Code 7.20. Solution
"""
* Assignment: CSV Format WriteObjects
* Complexity: medium
* Lines of code: 7 lines
* Time: 8 min

English:
    1. Define `result: str` with `DATA` converted to CSV format
    2. Non-functional requirements:
       a. Do not use `import` or any module
       b. Quotechar: None
       c. Quoting: never
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: str` z `DATA` przekonwertowaną do formatu CSV
    2. Wymagania niefunkcjonalne:
       a. Nie używaj `import` ani żadnych modułów
       b. Quotechar: None
       c. Quoting: nigdy
       d. Delimiter: `,`
       e. Lineseparator: `\n`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `vars(obj)`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is str, \
    'Variable `result` has invalid type, should be str'

    >>> print(result)
    sepal_length,sepal_width,petal_length,petal_width,species
    5.1,3.5,1.4,0.2,setosa
    5.8,2.7,5.1,1.9,virginica
    5.1,3.5,1.4,0.2,setosa
    5.7,2.8,4.1,1.3,versicolor
    6.3,2.9,5.6,1.8,virginica
    6.4,3.2,4.5,1.5,versicolor
    <BLANKLINE>
"""


class Iris:
    def __init__(self, sepal_length, sepal_width,
                 petal_length, petal_width, species):
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width
        self.species = species


DATA = [Iris(5.1, 3.5, 1.4, 0.2, 'setosa'),
        Iris(5.8, 2.7, 5.1, 1.9, 'virginica'),
        Iris(5.1, 3.5, 1.4, 0.2, 'setosa'),
        Iris(5.7, 2.8, 4.1, 1.3, 'versicolor'),
        Iris(6.3, 2.9, 5.6, 1.8, 'virginica'),
        Iris(6.4, 3.2, 4.5, 1.5, 'versicolor')]

# DATA converted to CSV format
# type: str
result = ...