Assessment of big data processing efficiency with Spark and Hive technologies

Authors:

Pavych N., Krochmalna O.

Lviv Polytechnic National University, Software Department

This paper analyzes contemporary technologies for big data processing. A software solution based on Hadoop is developed, and comparative results on the time efficiency of big data processing with Spark and with Hive are presented. Approaches to implementing software systems for big data processing with Spark or Hive are proposed.
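A time-efficiency comparison of this kind can be illustrated with a small timing harness. The sketch below is not the authors' implementation; it only assumes that each engine's workload can be wrapped in a callable. The names `time_runs`, `compare`, `run_spark_job`, and `run_hive_job` are hypothetical, and the two job functions are local stand-ins that would be replaced by code submitting the same query to Spark and to Hive.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(job: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run `job` `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        job()
        durations.append(time.perf_counter() - start)
    return durations


def compare(jobs: Dict[str, Callable[[], object]], repeats: int = 5) -> Dict[str, float]:
    """Return the median duration per engine name; the median damps outlier runs."""
    return {name: statistics.median(time_runs(job, repeats)) for name, job in jobs.items()}


# Hypothetical stand-ins (assumption): real code would submit the same query
# to each engine here instead of doing local arithmetic.
def run_spark_job() -> int:
    return sum(range(1000))


def run_hive_job() -> int:
    return sum(range(1000))


medians = compare({"spark": run_spark_job, "hive": run_hive_job}, repeats=3)
for engine, seconds in medians.items():
    print(f"{engine}: median {seconds:.6f} s")
```

Repeating each job and reporting the median, rather than a single run, reduces the influence of cluster warm-up and caching effects on the comparison.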
