Journal: Review of the National Center for Digitization
Print ISSN: 1820-0109
Publication year: 2017
Volume: 31
Pages: 30-39
Language: English
Publisher: Faculty of Mathematics
Abstract: Distributed computing implies the presence of unused software resources available on multiple computers that work as a single system. This kind of computing uses a system with a parallel architecture and varying node reliability. As a consequence, an adequate programming paradigm has to be used. The Web application described in this paper is designed with such a paradigm in mind and is developed using popular technologies. The proposed approach can attract two types of users: those who need additional computing resources (hereafter, seekers) and those who are willing to contribute by putting their computing resources at others' disposal (hereafter, helpers). A seeker is obliged to share their data, which is then divided into equal segments; the number of segments is defined by the seeker in advance. Next, the seeker has to define the processing procedure, i.e. the code for processing these segments separately. Finally, the seeker defines how the processed segments are reduced into the final result. This programming paradigm is known as MapReduce. Data can be in an arbitrary format (at the moment, the system has been evaluated for text and images) as long as the map function handles it appropriately. A helper is assigned a segment of the input data. The map function, defined by the seeker, is then executed within the helper's Web browser, and its result is returned to the system when processing finishes. The Web application's efficiency depends on the number and configuration of computing nodes. Four use cases are demonstrated in this paper: 1) counting words in a text file, 2) finding the largest number in a text file that contains numbers, 3) sharpening a corrupted image and 4) applying a blur effect to an image file. Because of its simplicity and universality, the system has potential for other, more complex computations and could, in the future, be applied in domains such as distributed content digitization and the analysis of data obtained from telescopes.
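The seeker's three inputs named in the abstract (a segment count, a map function and a reduce function) can be illustrated with the word-counting use case. The following is a minimal sketch only, not the paper's implementation: the paper's map function runs inside each helper's Web browser, whereas this sketch runs every segment locally in Python, and all function names (`split_into_segments`, `map_word_count`, `reduce_counts`, `run_job`) are hypothetical. It also assumes the text is split on word boundaries so that no word is cut between segments, which the abstract does not specify.

```python
from collections import Counter
from functools import reduce


def split_into_segments(items, n_segments):
    """Divide items into n_segments contiguous parts of near-equal size."""
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    base, extra = divmod(len(items), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        # The first `extra` segments receive one additional item.
        end = start + base + (1 if i < extra else 0)
        segments.append(items[start:end])
        start = end
    return segments


def map_word_count(segment):
    """Map step (hypothetical): count the words in one segment."""
    return Counter(word.lower() for word in segment)


def reduce_counts(left, right):
    """Reduce step (hypothetical): merge two partial word counts."""
    return left + right


def run_job(text, n_segments):
    """Run the whole job locally; each map call stands in for one helper."""
    words = text.split()
    segments = split_into_segments(words, n_segments)
    partials = [map_word_count(segment) for segment in segments]
    return reduce(reduce_counts, partials, Counter())
```

Under the same assumptions, the largest-number use case would only swap the two functions: the map step returns the maximum of the numbers in its segment, and the reduce step returns the larger of two partial maxima.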