fastavro
========


The current Python `avro` package is packed with features but dog slow.

On a test case of about 10K records, it takes about 14 seconds to iterate over
all of them. In comparison, the Java `avro` SDK does it in about 1.9 seconds.

`fastavro` is less feature-complete than `avro`, but it is much faster. It
iterates over the same 10K records in 2.9 seconds, and under PyPy it does so in
1.5 seconds (to be fair, the Java benchmark is doing some extra JSON
encoding/decoding).

If the optional C extension (generated by `Cython`_) is available, then
`fastavro` will be even faster. For the same 10K records it runs in about
1.7 seconds.

.. _`Cython`: http://cython.org/

Example
-------

::

    # Writing
    from fastavro import writer

    schema = {
        'doc': 'A weather reading.',
        'name': 'Weather',
        'namespace': 'test',
        'type': 'record',
        'fields': [
            {'name': 'station', 'type': 'string'},
            {'name': 'time', 'type': 'long'},
            {'name': 'temp', 'type': 'int'},
        ],
    }

    # 'records' can be any iterable, including a generator
    records = [
        {u'station': u'011990-99999', u'temp': 0, u'time': 1433269388},
        {u'station': u'011990-99999', u'temp': 22, u'time': 1433270389},
        {u'station': u'011990-99999', u'temp': -11, u'time': 1433273379},
        {u'station': u'012650-99999', u'temp': 111, u'time': 1433275478},
    ]

    with open('weather.avro', 'wb') as out:
        writer(out, schema, records)

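    # Reading
    import fastavro

    with open('weather.avro', 'rb') as fo:
        reader = fastavro.reader(fo)
        # The schema the file was written with is stored in the file itself
        schema = reader.schema

        # Each record comes back as a dict
        for record in reader:
            print(record)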


Documentation
-------------

.. toctree::
   :maxdepth: 1

   reader
   writer
   validation
   command_line_script

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

