Monday, June 16, 2014

13 Things You Didn’t Know You Could Do with Pig

You may not know it, but Pig lives all around you. LinkedIn, Twitter, Netflix, Salesforce… These internet giants (and many others) all use Apache Pig to help make sense of the massive amounts of data they generate on a daily basis.

It’s relatively well known that Pig is great for working with unstructured data (Pigs Eat Anything, per the official Apache Pig Philosophy), that it’s flexible and extensible (Pigs Are Domestic Animals), and that it sails through massive data sets with ease (Pigs Fly). That’s all true, but we’ve also stumbled onto several cool features of Pig that aren’t as well known. We compiled the list below to share some of the Piggy goodness.
  1. You Can Write In-Line GROUP BY and JOIN
  2. You Can Cast a Relation as a Scalar
  3. You Can Write UDFs in JavaScript
  4. You Can Set Hadoop Properties
  5. You Can Use Nested FOREACH Statements
  6. You Can Reuse Code Using Macros and Macro Libraries
  7. You Can Store into a Database
  8. You Can Grab a Sample Dataset
  9. You Can Stand on the Shoulders of UDF Giants
  10. You Can Use Shorthand
  11. You Can JOIN More Than Two Tables at Once
  12. You Can Run Shell Commands
  13. You Can (and Probably Should) Wear Hearing Protection
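
To give a flavor of a few of these, here is a minimal Pig Latin sketch touching items 2, 4, 5, and 8. It is only an illustration, not code from the original article: the input file `views.tsv`, its two-column schema, and all relation names are hypothetical, and the property being set is just one example of a Hadoop setting.

```pig
-- Hypothetical input: one tab-separated line per page view.
views = LOAD 'views.tsv' AS (user:chararray, url:chararray);

-- Item 4: set a Hadoop property from inside the script
-- (map-output compression is used here purely as an example).
SET mapred.compress.map.output true;

-- Item 8: grab a random sample of roughly 10% of the rows.
sampled = SAMPLE views 0.1;

-- Item 2: build a single-row relation so it can be used as a scalar later.
everything = GROUP views ALL;
total = FOREACH everything GENERATE COUNT(views) AS n;

-- Item 5: nested FOREACH, running DISTINCT inside each group.
by_user = GROUP views BY user;
per_user = FOREACH by_user {
    urls = DISTINCT views.url;
    GENERATE group AS user,
             COUNT(views) AS hits,
             COUNT(urls) AS distinct_urls;
};

-- Item 2 again: total.n casts the one-row relation to a scalar value.
shares = FOREACH per_user GENERATE user, distinct_urls,
             (double)hits / (double)total.n AS share;

DUMP shares;
```

The scalar cast only works when the relation holds exactly one row, which is why `total` is built with `GROUP ... ALL` first.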
Read more here
