PAUS-photoz bcnz v36


This is how we ingest PAUS photo-z releases into CosmoHub and PAUdb.

This example covers the file:

   /nfs/astro/eriksen/kcorr/bcnz_v36.parquet

1. Create a new production in the production table of db.pau.pic.es.

In this case we created production = 953, with input_production_id = 944. The other columns of the new production row also need to be filled in.
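
For reference, a minimal sketch of this step in Python, assuming psycopg2 is available and that the production table lives in the paudm schema with at least an id and an input_production_id column; the comment column and its text are purely hypothetical:

   import psycopg2

   # Connection parameters follow the hosts named on this page;
   # credentials and roles may differ in practice.
   conn = psycopg2.connect(host="db.pau.pic.es", dbname="dm", user="postgres")
   with conn, conn.cursor() as cur:
       # Register the new production and link it to its input production.
       cur.execute(
           "INSERT INTO paudm.production (id, input_production_id, comment) "
           "VALUES (%s, %s, %s)",
           (953, 944, "PAUS photo-z bcnz v36"),
       )
   conn.close()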

2. There is a Python notebook:

   /nfs/user/j/jcarrete/python_notebooks/PAUdm photoz parquet files.ipynb

where we convert the very large parquet file with many fields into a CSV file containing only the needed fields:

   /cephfs/pic.es/astro/scratch/jcarrete/sandbox/parquet_bcnz_v36_some_fields.csv
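
The notebook itself is not reproduced here, but the conversion boils down to something like the following sketch, assuming pyarrow and pandas are installed; the column names below are placeholders, not the real field list:

   import pyarrow.parquet as pq

   # Placeholder column names: use the actual fields required by the
   # paudm.photoz_bcnz model.
   columns = ["ref_id", "zb", "zb_mean", "odds", "pz_width", "chi2", "n_band", "ebv"]

   table = pq.read_table("/nfs/astro/eriksen/kcorr/bcnz_v36.parquet", columns=columns)
   table.to_pandas().to_csv(
       "/cephfs/pic.es/astro/scratch/jcarrete/sandbox/parquet_bcnz_v36_some_fields.csv",
       index=False,
   )

Reading only the needed columns keeps memory usage manageable; if the file still does not fit in memory, pq.ParquetFile(...).iter_batches() can be used to stream it in chunks.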

3. Ingest into paudb:

   tail -n +2 filename.csv | awk -F, '{print 866,$6,$7,$4,$5,$8,$1,$3,$2}' | tr ' ' ',' > filename_reordered.csv

This strips the header, prepends the production id as a constant first column (the 866 here is presumably left over from an earlier release; for this ingestion it should be 953), and reorders the remaining fields to match the database model.
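
The same reordering in Python, for reference (filename.csv is the generic input name used above; 953 is the production id from step 1):

   import csv

   with open("filename.csv", newline="") as src, \
        open("filename_reordered.csv", "w", newline="") as dst:
       reader = csv.reader(src)
       writer = csv.writer(dst)
       next(reader)  # skip the header line, like tail -n +2
       for r in reader:
           # awk fields $6,$7,$4,$5,$8,$1,$3,$2 are 0-based indices
           # 5,6,3,4,7,0,2,1; the production id goes first.
           writer.writerow([953, r[5], r[6], r[3], r[4], r[7], r[0], r[2], r[1]])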

I assume the CSV (comma-separated values) file has a header and that its fields are ordered in the same way as in the database model. Once the CSV is created, load it with:

   tail -n +2 filename.csv | pv -l | psql -U postgres -h db01.pau.pic.es dm -c "COPY paudm.photoz_bcnz FROM STDIN DELIMITER ',' CSV"
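
The same load can also be done from Python with psycopg2's copy_expert; a sketch under the same assumptions (header present, columns already in model order):

   import psycopg2

   conn = psycopg2.connect(host="db01.pau.pic.es", dbname="dm", user="postgres")
   with conn, conn.cursor() as cur, open("filename.csv") as f:
       next(f)  # skip the header line, like tail -n +2
       cur.copy_expert("COPY paudm.photoz_bcnz FROM STDIN DELIMITER ',' CSV", f)
   conn.close()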