I often hear this claim that SQL isn't composable, but is that really true? What exactly do you mean by it? It feels composable as long as you stay within the scope of SQL.
Replying to @schrepfler @drbridgewater
You can't really share subqueries in SQL. You'd want to be able to import WITH/AS definitions from a library, but you also need a way to bind some local table to a name and then call the library function on those tables (functions on tables).
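As a rough illustration of the point (a sketch using Spark's Scala API; the library, table names, and path are made up): in a code-first API a reusable subquery is just an ordinary function from tables to tables that can live in a library, whereas plain SQL has no standard way to import a WITH/AS definition from somewhere else.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

// A "function on tables" that can live in a shared library: take any table
// with user_id and ts columns and return per-user daily event counts.
// Plain SQL offers no standard way to import a WITH/AS block like this.
object AnalyticsLib {
  def dailyCounts(events: DataFrame): DataFrame =
    events
      .groupBy(col("user_id"), to_date(col("ts")).as("day"))
      .count()
}

object ComposeExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("compose").getOrCreate()
    // Bind a local table to a name, then apply the library function to it.
    val clicks = spark.read.parquet("/tmp/clicks") // hypothetical path
    AnalyticsLib.dailyCounts(clicks).show()
  }
}
```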
That's something dbt tries to do, mostly by adding some configs and Jinja templates. I suspect it's not quite as clean as composing Scala types and functions.
Replying to @squarecog @posco and others
What is the advantage of SQL over, say, Spark's API?
Orders of magnitude more people are already familiar with it. Trivial on-ramp for basic functionality. Loads of tools use SQL to get their data out. And it's a sufficiently high level of abstraction to enable many optimizations.
Replying to @squarecog @runT1ME and others
(I don't know about the Spark API, but Oscar can confirm that when we moved from Pig to Scalding, being code-first made some optimizations, like predicate and projection pushdown, much harder to implement.)
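To make the pushdown point concrete, here is a toy sketch (not Pig's or Scalding's actual internals): when the filter predicate is represented as data, a planner rule can move it below the scan; when it is an opaque user closure over whole records, there is nothing to inspect or relocate.

```scala
// Toy logical plan: Filter carries its predicate as data (column + literal),
// so a rewrite rule can push it into the scan that produces the rows.
sealed trait Plan
case class Scan(table: String, pushedFilter: Option[(String, Int)] = None) extends Plan
case class Filter(column: String, greaterThan: Int, child: Plan) extends Plan
case class Project(columns: Seq[String], child: Plan) extends Plan
// Contrast: in a purely code-first API the "filter" is an opaque function,
// so the engine cannot tell which columns it reads or whether it can move.
case class OpaqueFilter(predicate: Map[String, Any] => Boolean, child: Plan) extends Plan

object Pushdown {
  def pushFilters(plan: Plan): Plan = plan match {
    case Filter(c, v, Scan(t, None)) => Scan(t, Some((c, v)))                // pushed down
    case Filter(c, v, child)         => Filter(c, v, pushFilters(child))
    case Project(cols, child)        => Project(cols, pushFilters(child))
    case OpaqueFilter(p, child)      => OpaqueFilter(p, pushFilters(child))  // stuck where it is
    case scan: Scan                  => scan
  }
}
```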
Replying to @squarecog @runT1ME and others
Yeah, the Spark RDD API isn't terrible (I stand by Scalding being better), but you really want an optimizing compiler for your data jobs, and a library has a hard time doing that. Spark SQL can do it, but only with stringly typed code.
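For example, with Spark's own API (a hedged illustration of the tradeoff, not a claim about any particular query plan): a string or Column expression is data that Catalyst can analyze and push down, while a typed Scala lambda is opaque bytecode the optimizer has to treat as a black box.

```scala
import org.apache.spark.sql.SparkSession

case class Person(name: String, age: Int)

object StringlyVsTyped {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("catalyst").getOrCreate()
    import spark.implicits._

    val people = Seq(Person("Ada", 36), Person("Bob", 17)).toDS()

    // "Stringly typed": the expression is data, so Catalyst can analyze,
    // optimize, and push it down -- but a typo in the column name only
    // fails at runtime.
    val adults1 = people.filter("age >= 18")

    // Typed: compile-time checked, but the lambda is opaque bytecode to
    // Catalyst, so it can't be pushed below a scan or join.
    val adults2 = people.filter(p => p.age >= 18)

    adults1.explain()
    adults2.explain()
  }
}
```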
Replying to @posco @squarecog and others
I'm pretty baffled that Databricks hasn't made a Spark compiler plugin that could solve this issue. It would be really great to have strong types and an optimizer that runs on your entire flow. I guess they're focused on their Python users, though, and that wouldn't help them.
Replying to @posco @squarecog and others
Spark suffers in a lot of ways from supporting a Java API (which nobody uses) and a Python API (which everybody uses). Any Scala-only solution is a non-starter for them, which is a shame, because it could be so much better.
Replying to @jeremyrsmith @squarecog and others
I don't see why a Scala-only API is a non-starter. Why not have a common DAG computation engine and then specialize to Python and Scala as best you can? Also, the Scala APIs already use implicits, which are painful to use from Java.
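On the implicits point, a small sketch of the kind of signature that's natural in Scala but awkward from Java (loadClicks and the path are hypothetical): Scala resolves the Encoder silently via import spark.implicits._, while a Java caller has to construct and pass one explicitly.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

case class Click(userId: Long, url: String)

object ImplicitHeavyApi {
  // Idiomatic Scala: the Encoder is found implicitly at the call site.
  def loadClicks(spark: SparkSession, path: String)(implicit enc: Encoder[Click]): Dataset[Click] =
    spark.read.parquet(path).as[Click]

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("implicits").getOrCreate()
    import spark.implicits._                  // brings Encoder[Click] into scope
    val ds = loadClicks(spark, "/tmp/clicks") // hypothetical path

    // A Java caller has no implicit resolution: the encoder must be built
    // and passed explicitly, which is exactly the awkwardness mentioned above.
    val dsJavaStyle = loadClicks(spark, "/tmp/clicks")(Encoders.product[Click])
    ds.union(dsJavaStyle).show()
  }
}
```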
I suspect making the APIs roughly consistent across languages was a goal, which is why a lot of Scala programmers aren't super happy with them. A new set of APIs targeting Spark SQL/Catalyst could be viable, I think? Frameless does this, but I'd want to cut out Shapeless.
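Very roughly, the shape such an API could take (a hypothetical sketch, not Frameless's actual signatures): a thin typed wrapper whose operations still lower to Catalyst Column expressions, so the optimizer keeps working; the part Frameless uses Shapeless for, deriving column references from the case class, is stubbed out here with an explicit field name.

```scala
import org.apache.spark.sql.{Column, Dataset}
import org.apache.spark.sql.functions.col

// Hypothetical typed wrapper over Dataset[T]. Everything lowers to Catalyst
// Column expressions, so pushdown and codegen still apply. A real library
// would derive column names/types from T (Frameless uses Shapeless for that);
// here the field name is passed explicitly to keep the sketch small.
final case class TypedCol[T, A](underlying: Column) {
  def >(value: A): Column = underlying > value
  def ===(value: A): Column = underlying === value
}

final class TypedDS[T](val ds: Dataset[T]) {
  def typedCol[A](name: String): TypedCol[T, A] = TypedCol(col(name))
  def where(cond: Column): TypedDS[T] = new TypedDS(ds.filter(cond))
}

object TypedOverCatalyst {
  final case class Person(name: String, age: Int)

  // A library-level "function on tables", typed end to end.
  def adults(people: TypedDS[Person]): TypedDS[Person] =
    people.where(people.typedCol[Int]("age") > 18)
}
```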