Is Hadoop the new stored procedure ?

Two things happened today for me to ask this question and am not offering any serious answer, yet !

  • First I had some quick chat with folks at MongoDB and as a result got me thinking about the MapReduce in Mongo and where could it go. MongoDB also has the new declarative aggregation framework.
  • My thesis is that, while now the MongoDB aggregation framework is JSON semantics+$ keywords, it could look a lot like a functional programming language – with high-order declarative functions like map/reduce, discriminated unions (like F#) and currying.
  • And later in the day I read Edd’s blog “5 Big Data Predictions”, also in Forbes. (While both are the same blog, there might ne interesting comments in each)
  • Lots of interesting observations from Edd. He is predicting better programming language support, but may be we are looking at it the wrong way – what we need is a better stored procedure support in the data layer. It also could the next point Edd was talking about-Streaming data processing ! Where best could we have that feature than at the data layer ?
  • Would we be able to write a social science data platform using the MongoDB aggregators ? Would MongoDB mapReduce fit the bill now ? If not, what would it take to make it so ?
  • There are two obvious paths – connector to an application artifact for example Hadoop connector or embed the map/reduce in the data layer. Both have their advantages and disadvantages. With the connector the mapReduce can scale orthogonally, but with the embedded feature, one can achieve real-time processing (within limits). May be this is the time for an application data store !
  • Would the datastores like MongoDB gain features like the Twitter Storm, Real-Time map reduce, hierarchical iterative functional aggregators  and so forth ?
  • GreenPlum’s Chorus is interesting – Can NOSQL datastores gain some of the relevant capabilities that Chorus has?

Finally, the beginning as the end,

  1. Is hadoop the new stored procedure or would the new stored procedures look like Hadoop ?
  2. Is Data and Application becoming inseparable at scale ?
  3. What says thee?

One thought on “Is Hadoop the new stored procedure ?

  1. Pingback: @godwin - me myself and i

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s