A data engineering pipeline is basically ETL. SQL is for running reports.

ETL can vary a lot, especially with so many different kinds of data sources, structured and unstructured. A new kind of data source (e.g. video) requires a new way to extract useful data, transform it into some useful form (e.g. products mentioned in the video), and load it into a common store (e.g. an RDBMS). Then you can use SQL to manipulate the data and run reports.
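
To make the split concrete, here is a rough sketch of that flow in Python with an in-memory SQLite database. Everything in it is made up for illustration: the transcripts stand in for whatever a real video extractor would produce, and `KNOWN_PRODUCTS`, `product_mentions`, and the function names are hypothetical. A real extract step for video would be far more involved.

```python
import re
import sqlite3

# Hypothetical stand-in for data extracted from videos: (video_id, transcript).
RAW_TRANSCRIPTS = [
    ("v1", "Today we compare the Acme Phone and the Acme Watch."),
    ("v2", "The Acme Phone still has the best battery."),
]

# Hypothetical product catalog used by the transform step.
KNOWN_PRODUCTS = ["Acme Phone", "Acme Watch"]


def extract():
    """Extract: yield raw records from the source (hard-coded here)."""
    yield from RAW_TRANSCRIPTS


def transform(records):
    """Transform: turn free text into structured (video_id, product) rows."""
    for video_id, text in records:
        for product in KNOWN_PRODUCTS:
            if re.search(re.escape(product), text, re.IGNORECASE):
                yield (video_id, product)


def load(conn, rows):
    """Load: write the structured rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product_mentions (video_id TEXT, product TEXT)"
    )
    conn.executemany("INSERT INTO product_mentions VALUES (?, ?)", rows)
    conn.commit()


def report(conn):
    """Report: plain SQL over the loaded data."""
    return conn.execute(
        "SELECT product, COUNT(*) AS mentions FROM product_mentions "
        "GROUP BY product ORDER BY mentions DESC, product"
    ).fetchall()


def run_pipeline():
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    return report(conn)


if __name__ == "__main__":
    print(run_pipeline())
```

Only `extract` and `transform` would need to change for a new kind of source. The table and the SQL report stay the same, which is roughly the point about SQL being for the reporting end.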


