PAUS-photoz bcnz v36

Latest revision as of 13:00, 6 February 2020

This is the way we ingest PAUS photo-z releases in CosmoHub and PAUdb

This is the case for file:

   /nfs/astro/eriksen/kcorr/bcnz_v36.parquet

1. Create a new production in the production table of db.pau.pic.es.

In this case we create production = 953, with input_production_id = 944; some additional metadata is also needed when creating the row.
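Creating the production row can be sketched as below. The column names id and input_production_id are assumptions about the production table schema (check the actual model, and supply the extra metadata it requires); the sketch only builds the SQL string instead of executing it against db.pau.pic.es:

```python
# Hypothetical sketch: register the new production before ingesting.
# Column names are assumptions, not the verified schema of `production`.
production_id = 953
input_production_id = 944

sql = (
    "INSERT INTO production (id, input_production_id) "
    f"VALUES ({production_id}, {input_production_id});"
)
print(sql)
```

In practice this statement (with the remaining required fields filled in) would be run through psql against the dm database.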

2. There is a Python notebook:

   /nfs/user/j/jcarrete/python_notebooks/PAUdm photoz parquet files.ipynb

where we convert the very large Parquet file with many fields into a CSV file containing only the needed fields:

   /cephfs/pic.es/astro/scratch/jcarrete/sandbox/parquet_bcnz_v36_some_fields.csv
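The conversion step in the notebook can be sketched with pandas as below. The column names here are placeholders, not the actual bcnz_v36 schema, and a tiny in-memory DataFrame stands in for the real Parquet input:

```python
import io
import pandas as pd

# In the real notebook the input would be read from the Parquet file, e.g.:
#   df = pd.read_parquet("/nfs/astro/eriksen/kcorr/bcnz_v36.parquet")
# Here a small synthetic DataFrame stands in for it; column names are
# placeholders, not the real schema.
df = pd.DataFrame({
    "ref_id": [1, 2],
    "zb": [0.41, 0.87],
    "odds": [0.9, 0.7],
    "extra_field": ["a", "b"],   # stands in for the many unneeded columns
})

# Keep only the needed fields and write them as CSV (header included;
# it is stripped later by `tail -n +2` at ingestion time).
needed = ["ref_id", "zb", "odds"]
buf = io.StringIO()
df[needed].to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

Writing the real output to the scratch path above is then just a matter of passing that path to to_csv instead of an in-memory buffer.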

3. Ingest into paudb:

I assume I have a CSV (comma-separated values) file with a header, with the fields ordered in the same way as in the database model:

   tail -n +2 filename.csv | pv -l | psql -U postgres -h db.pau.pic.es dm -c "COPY photoz_bcnz FROM STDIN DELIMITER ',' CSV"
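COPY ... FROM STDIN expects data rows only, which is why the pipeline strips the header with tail -n +2 before piping into psql. A minimal Python equivalent of that header-skip step, using an inline string instead of filename.csv:

```python
import io

# Small in-memory stand-in for filename.csv (header + two data rows).
csv_with_header = "ref_id,zb,odds\n1,0.41,0.9\n2,0.87,0.7\n"

src = io.StringIO(csv_with_header)
next(src)          # drop the header line, like `tail -n +2`
body = src.read()  # this is what COPY ... FROM STDIN would receive
print(body)
```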