The analysis of apriori algorithm for structured and unstructured data

2017; pp. 62–68

Levus Ye. V. The analysis of apriori algorithm for structured and unstructured data / Ye. V. Levus, N. I. Nechypir, Yu. V. Polyniak // Visnyk Natsionalnoho universytetu "Lvivska politekhnika". Serie: Informatsiini systemy ta merezhi. — Lviv : Vydavnytstvo Lvivskoi politekhniky, 2017. — No 872. — P. 62–68.

Authors: 

Yevheniya Levus, Nadiya Nechypir, Yurii Polyniak

Software Department, Lviv Polytechnic National University, 12, S. Bandery Str., Lviv, 79013, Ukraine

  1. Yevheniia.V.Levus@lpnu.ua,
  2. nadia.nechypir@gmail.com,
  3. yurii.polyniak@gmail.com

The Apriori algorithm is analyzed as a method of searching for association rules in structured and unstructured data, in terms of the number of discovered rules, performance, and computing resource requirements. Unstructured data are closely related to the term "Big Data". One of the main tasks of data engineering is finding means of processing unstructured information. To perform computational experiments, a software system has been developed that processes trade-domain data using the Apriori algorithm. Such a system can serve as a prototype for a real recommendation system. The software solution is built on the Hadoop technology stack.
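
To illustrate the kind of association-rule search the abstract refers to, here is a minimal single-machine Python sketch of Apriori on a toy set of shopping baskets. It is not the authors' Hadoop-based system: the basket data, the thresholds, and the function names `apriori` and `rules` are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples above the threshold."""
    found = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Subsets of a frequent itemset are frequent, so the lookup is safe.
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, confidence))
    return found


# Hypothetical trade data: each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

frequent_itemsets = apriori(baskets, min_support=0.5)
for antecedent, consequent, confidence in rules(frequent_itemsets, min_confidence=0.7):
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

The two thresholds control how many rules are discovered. The sketch rescans all transactions for each candidate, and that repeated scanning is the cost that grows with data volume and motivates a distributed implementation.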
