  1. file - What are the pros and cons of the Apache Parquet format …

    Apr 24, 2016 · Some characteristics of Apache Parquet are: self-describing; columnar format; language-independent. In comparison to Apache Avro, Sequence Files, RC File etc. I want an …

  2. Reading / Fixing a corrupt parquet file - Stack Overflow

    Sep 3, 2024 · Either the file is corrupted or this is not a parquet file. when I tried to construct a ParquetFile instance. I assume appending PAR1 to the end of the file could help this?

  3. How to append new data to an existing parquet file?

    Apr 27, 2023 · I have parquet files with some data in them. I want to add more data to them frequently every day. I want to do this without having to load the object to memory and then …

  4. Extension of Apache parquet files, is it '.pqt' or '.parquet'?

    Oct 19, 2021 · I wonder if there is a consensus regarding the extension of parquet files. I have seen a shorter .pqt extension, which has the typical 3 letters (like csv, tsv, txt, etc.) and then …

  5. Spark parquet partitioning : Large number of files

    Jun 28, 2017 · Spark 2.2+: From Spark 2.2 on, you can also use the new option maxRecordsPerFile to limit the number of records per file if your files are too large. You will …

  6. Python: save pandas data frame to parquet file - Stack Overflow

    Jan 29, 2019 · Is it possible to save a pandas data frame directly to a parquet file? If not, what would be the suggested process? The aim is to be able to send the parquet file to another …

  7. How to read a Parquet file into Pandas DataFrame?

    How to read a modestly sized Parquet data-set into an in-memory Pandas DataFrame without setting up a cluster computing infrastructure such as Hadoop or Spark? This is only a …

  8. How to append data to an existing parquet file - Stack Overflow

    Aug 31, 2016 · (1) Write multiple parquet files, then combine them at a later stage. (2) Write multiple parquet files. The tool you are using to read the parquet files may support reading multiple files …

  9. Methods for writing Parquet files using Python? - Stack Overflow

    Oct 5, 2015 · I'm having trouble finding a library that allows Parquet files to be written using Python. Bonus points if I can use Snappy or a similar compression mechanism in conjunction …

  10. Spark save (write) parquet only one file - Stack Overflow

    Aug 1, 2018 · I have a Parquet directory with 20 parquet partitions (=files) and it takes 7 seconds to write the files. When using coalesce(1), it takes 21 seconds to write the single Parquet file. …