There will be no singularity
https://github-contributions.vercel.app
Forgot to post the year in review :)
SQL-WTF TIL (Snowflake edition)
WHAT FIELDS WILL I SEE AS A RESULT OF A SELECT QUERY THAT RETRIEVES MULTIPLE FIELDS WITH IDENTICAL NAMES?
Hold on before answering; it's not all that obvious 🙂
Let me explain what the answer depends on:
The type of client from which the query is made.
- Whether we are using USING or not.
- Whether there is data in the sources or not.
- Whether we are looking directly at the result of the executed query or at the result from the history using RESULT_SCAN.
Cool, right? Now, I'll briefly explain each point 🙂
Client Type
Database client creators love adding spices to the work process of data engineers 🙂 All examples discussed were executed in Snowsight (in SnowSQL, the result will be different). To be fair, it's not always clear which part of the described chaos is attributed to the client and which to the database itself. But we won't delve into that now…
The creators decided that when more than one column with the same name is encountered in the result, it would be a good idea to rename them! A, A1, A2… The idea is great, but how do I now understand if these are actually existing columns or not?
In reality, such columns do not exist. You cannot refer to them in SQL.
USING
USING is syntactic sugar to shorten the ON clause. But it affects not only ON, which might seem logical, but also the list of columns.
It is implied that the result should be identical to:
and we should get A, B, A, B (or the renamed version described above).
But no! USING removes one column A. Really, why do you need 2 identical columns that will break everything for you later? So, it should be A, B, B.
Everything is fine here. But what if we continue to use JOIN with USING?
Do you think that each time another column B will be added? Nope! 🙂 All subsequent JOIN USING with identically named columns will simply be ignored! Moreover, this will happen until a JOIN with ON is encountered!
will return A B B2 A2 B_3
Empty and Non-Empty Tables
Yes, Snowflake (Snowsight) renames columns differently depending on whether there is data in the tables or not!
RESULT_SCAN
This function returns the result of a query by its UUID in QUERY_HISTORY:
But now, A1 and B1 are actual columns!
More examples are here: Github
Thumbs up here: Linkedin
WHAT FIELDS WILL I SEE AS A RESULT OF A SELECT QUERY THAT RETRIEVES MULTIPLE FIELDS WITH IDENTICAL NAMES?
Hold on before answering; it's not all that obvious 🙂
Let me explain what the answer depends on:
The type of client from which the query is made.
- Whether we are using USING or not.
- Whether there is data in the sources or not.
- Whether we are looking directly at the result of the executed query or at the result from the history using RESULT_SCAN.
Cool, right? Now, I'll briefly explain each point 🙂
Client Type
Database client creators love adding spices to the work process of data engineers 🙂 All examples discussed were executed in Snowsight (in SnowSQL, the result will be different). To be fair, it's not always clear which part of the described chaos is attributed to the client and which to the database itself. But we won't delve into that now…
The creators decided that when more than one column with the same name is encountered in the result, it would be a good idea to rename them! A, A1, A2… The idea is great, but how do I now understand if these are actually existing columns or not?
In reality, such columns do not exist. You cannot refer to them in SQL.
USING
USING is syntactic sugar to shorten the ON clause. But it affects not only ON, which might seem logical, but also the list of columns.
CREATE TABLE T1 (A INT, B INT);
SELECT
*
FROM T1
LEFT JOIN T1 as T1_1 USING(A);
It is implied that the result should be identical to:
SELECT
*
FROM T1
LEFT JOIN T1 as T1_1 ON T1.A = T1_1.A
and we should get A, B, A, B (or the renamed version described above).
But no! USING removes one column A. Really, why do you need 2 identical columns that will break everything for you later? So, it should be A, B, B.
Everything is fine here. But what if we continue to use JOIN with USING?
SELECT
*
FROM T1
LEFT JOIN T1 as T1_1 USING(A)
LEFT JOIN T1 as T1_2 USING(A)
...
Do you think that each time another column B will be added? Nope! 🙂 All subsequent JOIN USING with identically named columns will simply be ignored! Moreover, this will happen until a JOIN with ON is encountered!
SELECT
*
FROM T1
LEFT JOIN T1 as T1_1 USING(A)
LEFT JOIN T1 as T1_2 USING(A)
LEFT JOIN T1 as T1_3 ON T1.A = T1_3.A
will return A B B2 A2 B_3
Empty and Non-Empty Tables
Yes, Snowflake (Snowsight) renames columns differently depending on whether there is data in the tables or not!
-- EMPTY TABLES
SELECT * FROM t1 CROSS JOIN t1 a;
-- A B A_2 B_2
-- TABLE T1 IS NOT EMPTY
SELECT * FROM t1 CROSS JOIN t1 a;
-- A B A B
RESULT_SCAN
This function returns the result of a query by its UUID in QUERY_HISTORY:
SELECT * FROM t1 CROSS JOIN t1 a;
-- A B A_2 B_2
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
-- A B A_1 B_1
But now, A1 and B1 are actual columns!
SELECT A_1 FROM
(
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
);
-- OK
More examples are here: Github
Thumbs up here: Linkedin
Media is too big
VIEW IN TELEGRAM
We at dwh.dev decided to make some vertical videos (how do you do, fellow kids?).
What do you think?
Like it on YouTube, please.
What do you think?
Like it on YouTube, please.
There will be no singularity
Final stop! Posgresql. WASM. Browser. https://www.crunchydata.com/blog/learn-postgres-at-the-playground
GitHub
GitHub - electric-sql/pglite: Lightweight Postgres packaged as WASM into a TypeScript library for the browser, Node.js, Bun and…
Lightweight Postgres packaged as WASM into a TypeScript library for the browser, Node.js, Bun and Deno - electric-sql/pglite
You are here:
TypeScript + SQL-based OPERATING SYSTEM by Mike Stonebraker
https://www.dbos.dev/blog/announcing-dbos
TypeScript + SQL-based OPERATING SYSTEM by Mike Stonebraker
https://www.dbos.dev/blog/announcing-dbos
www.dbos.dev
Hello DBOS - Announcing DBOS Cloud
Introducing transactional serverless computing for TypeScript, enabled by a revolutionary new operating system that puts a distributed DBMS in the kernel of the OS.
2024 MAD (Machine Learning, AI & Data) Landscape
https://mattturck.com/mad2024/
PDF: https://mattturck.com/landscape/mad2024.pdf
https://mattturck.com/mad2024/
PDF: https://mattturck.com/landscape/mad2024.pdf
Matt Turck
Full Steam Ahead: The 2024 MAD (Machine Learning, AI & Data) Landscape
This is our tenth annual landscape and “state of the union” of the data, analytics, machine learning and AI ecosystem.
In 10+ years covering the space, things have never been as exciting and promising as they are today. All trends and subtrends we described…
In 10+ years covering the space, things have never been as exciting and promising as they are today. All trends and subtrends we described…
Please open Telegram to view this post
VIEW IN TELEGRAM
Friends from Luna Park are looking for a Data Engineer — could be you or someone you know (could be DS, Python dev or ML engineer if you ask me)
Palabra.ai — the first ever real-time voice interpreter: starting as a tiny team of five engineers, they built a speech-to-text solution that works 50 times faster than OpenAI's Whisper!
The interpreter prototype is releasing in a month and will work in Zoom with a two-second delay — right now it’s one of a kind!
They are now looking for a senior engineer, who will write and optimize complex multi-node data pipelines, work on large-scale data scraping and manage datasets that include hundreds of thousands of hours of audio data.
Key requirements:
🟣 5+ years of industry experience as a software developer/data engineer;
🟡 Python skills (including modern backend frameworks, low-level asyncio, multithreading/multiprocessing);
🔵 A good grasp on neural networks and related tools (NumPy, PyTorch, audio-related libraries such as torchaudio, librosa, etc);
🟢 Experience in deploying, orchestrating, and scaling multi-node pipelines in the cloud.
Nice-to-haves:
🔘 Compilated languages such as Go, C/C++, and Rust;
🔘 CUDA or other GPU-related frameworks experience;
🔘 Audio processing-related experience.
Salary is $70k-100k, plus equity up to 1%. It’s a fully remote position.
To apply or learn more, reach out to my Luna Park buddy Fedya @owlkov
Palabra.ai — the first ever real-time voice interpreter: starting as a tiny team of five engineers, they built a speech-to-text solution that works 50 times faster than OpenAI's Whisper!
The interpreter prototype is releasing in a month and will work in Zoom with a two-second delay — right now it’s one of a kind!
They are now looking for a senior engineer, who will write and optimize complex multi-node data pipelines, work on large-scale data scraping and manage datasets that include hundreds of thousands of hours of audio data.
Key requirements:
Nice-to-haves:
Salary is $70k-100k, plus equity up to 1%. It’s a fully remote position.
To apply or learn more, reach out to my Luna Park buddy Fedya @owlkov
Please open Telegram to view this post
VIEW IN TELEGRAM
Forwarded from Roman Agabekov
Hey friends,
We built a database performance management tool Releem that helps engineers to keep database servers fast, secure and reliable.
Why Releem:
- Automatic performance tuning
- Fast slow query identification
- Monitoring & tuning in one place
- MySQL / MariaDB & AWS RDS.
- Open-source agent
After few years development we launched on Product Hunt.
If you interested in such solutions, have some questions or feedback appreciate your support❤️ here https://www.producthunt.com/posts/releem
We built a database performance management tool Releem that helps engineers to keep database servers fast, secure and reliable.
Why Releem:
- Automatic performance tuning
- Fast slow query identification
- Monitoring & tuning in one place
- MySQL / MariaDB & AWS RDS.
- Open-source agent
After few years development we launched on Product Hunt.
If you interested in such solutions, have some questions or feedback appreciate your support❤️ here https://www.producthunt.com/posts/releem
Product Hunt
Releem - Database performance management tool for developers | Product Hunt
Releem makes managing and tuning database servers easy. ⚙️ Automatic performance tuning 🐇 Fast slow query identification ✔️ Monitoring & tuning in one place 🔀 MySQL / MariaDB & AWS RDS. 🌐 Open-source agent