Let’s kick off this blog post with a numbers game, shall we? The answer is: 50+ women, who also happen to be data geeks!
Doing what, you’re wondering? That is the number of women who participated in the most recent technical training session organized by the Women in Big Data (WiBD) forum. This all-day technical workshop on Apache Drill and Apache Spark was sponsored by MapR and held at their headquarters in San Jose on May 4th, 2016.
In the morning, we learned about the essentials of Apache Drill, including these topics:
• The SQL-on-Hadoop landscape & where Drill fits in
• Introduction to Apache Drill
• How Drill achieves flexibility & performance – Architecture overview
Interactive demos included:
• Using Drill to query files, Hive tables & HBase/MapR-DB
• ANSI SQL functionality (including queries on JSON, Parquet)
• Working with nested data using Drill
• Using Tableau (BI tools) with Drill
In the afternoon, we learned about the essentials of Apache Spark:
Introduction to Apache Spark
• Describe the features of Apache Spark
• Advantages of Spark
• How Spark fits in with the Big Data application stack
• How Spark fits in with Hadoop
• Define Apache Spark components
Load and Inspect Data in Apache Spark
• Describe different ways of getting data into Spark
• Create and use Resilient Distributed Datasets (RDDs)
• Apply transformations to RDDs
• Use actions on RDDs
The training session concluded with a demo of how to write a simple Spark application in Java.
Thanks to these speakers from MapR for their time and effort in making this a successful training event (greatly appreciated by WiBD members):
• Neeraja Rentachintala, Sr Director Product Management at MapR – Drill Session (Overview)
• Andries Engelbrecht, Partner Systems Engineer at MapR – Drill Session (BI connection)
• James Casaletto, Senior Curriculum Developer at MapR – Spark Session
What stood out for me after the morning training session on Apache Drill was that, as an ex-DBA, I felt a little loss of the power that typically comes with the role. On the bright side, from another perspective (wearing the Analyst/BI hat), I felt empowered and equipped to be more productive in a short amount of time. I am certain that once the training session was over, a lot of us couldn’t wait to get our hands dirty with the tools that were shared with us. You can access the slides here.
During the lunch break, we heard Apache committer Ellen Friedman speak about how organizational culture is changing as big data analytics tools evolve. We also got a glimpse of MapR culture from their HR team, along with current hiring opportunities. As at past WiBD meetups, there was high energy among members and ample opportunity to network with each other.
Heartfelt thanks to Alicia Alvarez from MapR for being the glue that held everything together for this event on behalf of WiBD. Thanks also to my fellow WiBD training committee members, especially Radhika and Yulia, for all their hard work in making this technical training opportunity possible for the WiBD community. I look forward to working with you on similar future events!