Ordered Access Memory and Its Application in Parallel Processors

2017;
: pp. 54 - 62
Authors:
1
Lviv Polytechnic National University, Computer Engineering Department

In this paper, after analyzing the known memory access methods, conventional memory organization and its challenging problems, we propose new ordered memory access method and a new type of memory – the ordered access memory. This method is aimed at working with data arrays and provides memory access in the prescribed manner. Proposed method unlike widely used method of sequential memory access allows extending the functionality of the memory as it provides not only sequential, but also any other ordered memory access. Unlike another widely used method of address memory access, the implementation of the proposed method provides parallel conflict-free memory access. It also allows eliminating data binding to a specific memory location that makes it possible to disintegrate the apparatus for data ordering and eliminates the need to store addresses of locations the data are placed in, and the need to submit the address to the address inputs during data writing and reading.

The new method distinctive features compared to the known memory access methods are considered. Input data, their indices and output data of the ordered access memory are described as well as the approaches to this type of memory design and use. The interface of the ordered access memory is considered as well as its advances compared to the random, associative, and sequential access memories. An example of the ordered access memory usage in application-specific processors with parallel and pipeline structures is demonstrated and the results of the ordered access memory implementation in FPGA are considered.

[1] National Research Council. The Future of Computing Performance: Game Over or Next. Level? The National Academies Press. 2011.

[2] 21st Century Computer Architecture A community white paper May 25, 2012 http://www.cra.org/ccc/files/docs/init/21stcenturyarchitecturewhitepaper...

[3] Richard C. Murphy. On the Effects of Memory Latency and Bandwidth on Supercomputer Application Performance. In IEEE International Symposium on Workload Characterization 2007 (IISWC2007), September 27–29, 2007

[4] Melnyk, A., Computer architecture, Lutsk regional printing. Lutsk, 2008.

[5] A. Melnyk. Buffer memory device. USSR patent No. 1479954, issued at 1989.

[6] A. Melnyk. Sorting memory devices for digital signal processing systems. 1-th Ukrainian conference “Signal processing and image recognition”. Kyiv, 17–21 of November 1992. – pp. 187–188.

[7] A. Melnyk. Design principles of buffer sorting memory // Proceedings “Computer engineering and information technologies”. Lviv Polytechnic State University, 1996. – No. 307. – pp. 65–71.

[8] A. Melnyk. Real-time application-specific computer systems. Lviv Polytechnic State University, 1996. – 60 P.

[9] A. Melnyk. Structure organization of ordered access memory based on the tunable sorting networks. // Informatics and computing technique. University “Ukraine”, 2011, pp. 34–46.

[10] A. Melnyk. Ordered access memory. – Lviv Polytechnic Publishing House, 2014. – 330 p.

[11] Patterson, D. and Hennessy, J., Computer Architecture. A Quantitative Approach, Morgan Kaufmann Publishers Inc., 1996.

[12] Stallings, W., Computer Organization and Architecture, Pearson, 10th ed., 2016.

[13] Tanenbaum, A., Structured Computer Organization, 6th ed., Pearson, 2013.

[14] Bruce Jacob. Memory Systems: Cache, DRAM, Disk / Bruce Jacob, Spencer Ng, David Wang, Morgan Kaufmann Series in Computer Architecture and Design, 2007.

[15] V. Cuppu, B. Jacob, B. Davis, and T. Mudge. High-performance drams in workstation environments. IEEE Transaction on Computer, 50(11):1133–1153, 2001.

[16] J. Shao and B. T. Davis. A burst scheduling access reordering mechanism. In HPCA '07: 13 th International Symposium on High-Performance Computer Architecture, Phoenix, AZ, USA, Februaru 10–14, 2007.

[17] Jingtong Hu, Chun Jason Xue, Wei-Che Tseng, Meikang Qiu, Yingchao Zhao, Edwin H.-M. Sha. Minimizing Memory Access Schedule for Memories. The Fifteenth International Conference on Parallel and Distributed Systems (ICPADS'09), 2009.

[18] Jingtong Hu, Chun Jason Xue, Wei-Che Tseng, Qingfeng Zhuge, Yingchao Zhao, Edwin H.-M. Sha. Memory Access Schedule Minimization for Embedded Systems. Journal of Systems Architecture: Embedded Software Design (JSA), Oct. 2011.

[19] Knuth D.E. The Art of Computer Programming. Volume 3: Sorting and Searching. 2nd edn. Addison-Wesley. 1998.

[20] H. S. Stone. Parallel Processing with the Perfect Shuffle, IEEE Transactions on Computers, Vol. 20, pp. 153–161, 1971.

[21] G. M. Masson, G. C. Gingher and S. Nakamura. A Sampler of Circuit Switching Networks. EEE Computer, Vol. 12, No. 6, pp. 32–47, June 1979.

[22] D. Nassimi and S. Sahni. A Self-Routing Benes Network and Parallel Permutation Algorithms. IEEE Transactions on Computers, Vol. 30, pp. 332–340, 1981.

[23] K. E. Batcher. Sorting Networks and Their Applications. Proc. AFIPS Spring Joint Computer Conf. 32, pp. 307–314, 1968.

[24] C. D. Thompson and H. T. Kung. Sorting on a Mesh-Connected Parallel Computer. Comm. ACM, Vol. 20, pp. 263–271, 1977.

[25] Rene Mueller, Jens Teubner, Gustavo Alonso. Sorting Networks on FPGAs. The VLDB Journal, Vol. 21, No. 1, p. 1–23, February 2012.

[26] Y.Jun, Li Na, D. Jun, Guo Y., Tang Z. A research of high-speed Batcher's odd-even merging network. E-Health Networking, Digital Ecosystems and Technologies (EDT), Vol. 1, pp. 77–80, April 2010.

[27] Jean-Philippe Thiran, Herve Bourlard, Ferran Marques, Multi-Modal signal processing: methods and techniques to build multimodal interactive systems. Academic Press Inc. 23 November 2009. – 448 p.

[28] F. Camastra and A. Vinciarelli. Machine Learning for Audio, Image and Video Analysis: Theory and Applications. Springer, 2008.

[29] Handbook of Signal Processing Systems. Editors: Shuvra S. Bhattacharyya, Ed F. Deprettere, Rainer Leupers, Jarmo Takala. – Springer, 2010. – 1117 p.

[30] Melnyk A., Melnyk V. “Personal Supercomputers: Architecture, Design, Application”. – Lviv Polytechnic Publishing House, 2013. – 516 p.

[31] J.Leverich Comparative Evaluation of Memory Models for Chip Multiprocessors/ J. Leverich, H. Arakida, A. Solomatnikov, A. Firoozshahian, M. Horowitz, C. Kozyrakis // ACM Transactions on Architecture and Code Optimization. – November 2008.