
Study of Environmental Structure Cleaner Robot Using for Efficient to Emerge Human Substantial Ability
-Design of Path Planning for Cleaning Robot Adapted to Biased Dirt Distribution-

Zhang JINKAI, Graduate School of Engineering, Hokkaido University of Science, 975@hus.ac.jp
Satoshi TAKEZAWA, Hokkaido University of Science
Masao NAGAMATSU, Hokkaido University of Science
Akihiko TAKASHIMA, Hokkaido University of Science

Advances in information processing technology, robotics, and the automation of household chores and building maintenance have dramatically improved our quality of life. Following this trend, cleaning robots are expected to be introduced indoors more rapidly than ever. The automatic cleaning robot, which travels around indoors and cleans on its own, is a representative housework robot. In this paper, we propose a new path planning algorithm for a cleaning robot that cleans efficiently by selecting cleaning targets and assigning them priorities based on the bias of the dirt distribution.

Key Words: Robot, Path Planning, Reinforcement Learning, Rewards

1 Preliminary survey of cleaning in the bus

The project targets four operations: regular cleaning, major cleaning, sight-seeing bus cleaning and tour bus cleaning. The table below shows the vehicle cleaning work standards, extracted from Kagoshima City Transport Bureau material.
Table: Vehicle cleaning work standard
Operation | Places | Cleaning method | Remarks
Regular cleaning | Floor | Sweep out dust and sand with a broom and wipe with a well-squeezed mop | Clean the places with the mop after washing it
Major cleaning | Floor | Same method as above, with a hard-squeezed mop | Same as above
Sight-seeing bus cleaning, tour bus cleaning | Floor | Sweep out dust and sand with a broom and wipe with a mop (use the mop after washing and drying it) | Under seats, aisle, driver's seat. [ ] Mop is located at the dock of

2 Problems of cleaning inside of the bus

After a bus returns to the terminal at the end of operating hours, little time is left for cleaning before the next day's operation. This is the reason for relying on mechanical maintenance such as robots. The expected problems are as follows.

a) As Fig.1 shows, depending on the weather the inside of the bus becomes a harsh environment, and the load on the cleaning robot becomes large. Although this may be an overstatement, the demand may approach the level of landmine removal.
b) A mechanism is needed to automatically clean up dirt such as dross, sand, gum, and vomit.
c) As Fig.2 shows, maintenance cleaning under the bus seats is required. Various problems caused by Hokkaido's unique winter weather (polluted water, sand mixed with snow-melting agent) can also be assumed.

Fig.2 Areas under bus seats
Fig.3 Layout of bus seats (axes: X [m], Y [m])
Fig.4 Areas under bus seats

3 Current existing robots

Robots currently available on the market are listed below, together with the problems they appear to face. Braava, a floor-wiping robot for domestic chores, is equipped with a navigation system.

Recommendations:
a) It is quiet and excellent at floor wiping; it cleans with a floor sheet rather than by suction.
b) Since disposing of the dirt only requires changing the sheet, it is simple and clean.
c) The main body is small, white and stylish. It can be stored upright and saves space.

Braava has no automatic charging function, so it is never left unattended with a wet sheet attached during wet wiping. This design avoids electrical short circuits. Among suction-type models there is the Roomba (US / iRobot Corporation); the author has used one and it is still in use. However, the Roomba 98 is expensive, followed by the Roomba 6. A Japanese domestic machine, the Sharp COCOROBO RX-V6, is also very competitive. In some cases, countermeasures against persistent dirt and pre-cleaning by the user are essential before use. For more details, refer to the following URL: http://dyson-twinbird.seesaa.net/article/225525436.html

4 Design specifications for robotization

Regarding the possibility of robotization, the following items and the specifications in the table below need to be clarified.

a) What are the problems of current cleaning robots?
b) Is the cleaning target the whole space inside the bus?
c) Which is better: one cleaning robot per bus, or several robots per bus?
d) Is there a time limit for the cleaning process?
e) What needs to be considered for future commercialization?
f) What is the deadline for inventing and modifying the cleaning robot?

Table: Specifications required for cleaning robots
Specification | Requirement
Autonomous traveling | Yes / No
Frequency of usage | Daily?
Weight limit | Within 5 kg
Mud and water countermeasures | Yes / No
Shape | Disc type
Rechargeable? | Yes / No
Power supply system |
Period of usage | During night?
Material |
Drive system | Motor
Colour |
Countermeasures for noise | Yes / No
Waterproof | Yes / No

5 Theory

Advances in information processing technology, robotics, and the automation of household chores have dramatically improved our quality of life. Following this trend, household robots are expected to be introduced into ordinary homes more rapidly than ever. The automatic cleaning robot, which travels around indoors and cleans on its own, is a representative housework robot. Earlier automatic cleaning robots cleaned the floor by random movement, but in recent years more models clean efficiently along systematically programmed routes. Route planning for automatic cleaning robots has been studied actively, and various route planning methods have been proposed so far.

For example, Choset [1] proposes a method that divides the cleaning target into partial regions based on obstacles and cleans the partial regions one after another. There are many other path planning methods that use region segmentation, such as Chibin et al. [2], who use a genetic algorithm, and Zhongmin et al. [3], who use ant colony optimization to efficiently determine the order of passage between partial regions. There are also many methods that do not perform region segmentation, such as Luo et al. [4], who use neural networks, and Young et al. [5], who use the Distance Transform.
All of the path planning methods reported so far, including those mentioned above, focus on how efficiently the whole region can be cleaned. However, in an actual environment the distribution of dirt is not uniform; it has a characteristic bias caused by the shape of the region and by the behaviour of the cleaning robot's owner. Therefore, rather than cleaning the entire region uniformly, it is considered more efficient to concentrate on the regions where dirt accumulates. Hence, in this research, we propose a new route planning algorithm for a cleaning robot that cleans efficiently by selecting cleaning targets and assigning them priorities based on the bias of the dirt distribution.

5.1 Algorithm of route planning based on dirt distribution

In this proposal, the cleaning target is first divided into partial regions based on the distribution of contamination, then the order of passage through the partial regions is obtained, and finally the route within each partial region follows a pre-determined cleaning route. The region to be cleaned is represented by grid cells of the same size as the cleaning robot.

5.1.1 Region segmentation

The proposed region segmentation uses the additively weighted discrete Voronoi diagram [6], the k-means method [7], and Newton's law, together with the total amount of contamination in each partial region. The target region is divided so that the difference in that amount between partial regions becomes small. As a result, small partial regions are generated densely where the density of contamination is high, and large partial regions are generated sparsely where it is low.

When every partial region contains an equal amount of contamination, a small partial region can be covered at a lower movement cost. Therefore, by introducing the region segmentation described above, it becomes possible to obtain a passing order that collects the dirt in the partial regions more efficiently.

An additively weighted discrete Voronoi diagram divides a discretized space by boundary lines between adjacent generating points, where a weight is added to the distance from each generating point. The set $V(a_i)$ of cells $x$ belonging to the generating point $a_i$ is given by equation (1):

$$V(a_i) = \{\, x \mid d(x, a_i) + w_i \le d(x, a_j) + w_j,\ j \ne i \,\} \qquad (1)$$

Here, $d$ is the distance from the center of the cell to the generating point, and $w_i$ is a weight that depends on the total amount of contamination of the cells belonging to the generating point $a_i$.

In the region segmentation of the proposed method, generating points are first placed at random in the target region, equal weights are given to every generating point, and the Voronoi diagram is applied to generate the partial regions. Next, the k-means method is applied to the generating point $a_i$ of each partial region and Newton's law is applied repeatedly to the weight of each $a_i$, so that the region division is optimized until the difference in the total amount of contamination between partial regions becomes small.

The k-means method is a non-hierarchical clustering algorithm; here it solves the optimization problem of equation (2):

$$\underset{a_1, \dots, a_k}{\arg\min} \sum_{i=1}^{n} \min_{j} \lVert x_i - a_j \rVert^2 \qquad (2)$$

In the proposed method, each cell $x_i$ is a data point to be clustered, and each generating point of the additively weighted discrete Voronoi diagram is a cluster center.
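The cell assignment and the k-means style update of the generating points can be illustrated with a minimal sketch. It assumes that the weight is added to the Euclidean distance, as the text describes; the function names, the small grid, and the weight values are hypothetical and are not taken from the paper.

```python
import math

def assign_cells(cells, generators, weights):
    """Assign each cell to the generating point that minimizes d(x, a_i) + w_i."""
    regions = [[] for _ in generators]
    for cell in cells:
        best = min(
            range(len(generators)),
            key=lambda i: math.dist(cell, generators[i]) + weights[i],
        )
        regions[best].append(cell)
    return regions

def update_generators(regions, generators):
    """k-means style step: move each generating point to the centroid of its region."""
    updated = []
    for region, old in zip(regions, generators):
        if not region:
            # An empty region keeps its previous generating point.
            updated.append(old)
            continue
        xs = [c[0] for c in region]
        ys = [c[1] for c in region]
        updated.append((sum(xs) / len(region), sum(ys) / len(region)))
    return updated

# Hypothetical 4 x 2 grid of cell centres and two generating points.
cells = [(x + 0.5, y + 0.5) for x in range(4) for y in range(2)]
generators = [(0.5, 1.0), (3.5, 1.0)]

# Equal weights split the grid evenly.
regions_equal = assign_cells(cells, generators, [0.0, 0.0])
# A larger weight on the first point (assumed to reflect more dirt) shrinks its region.
regions_biased = assign_cells(cells, generators, [2.0, 0.0])
new_generators = update_generators(regions_equal, generators)
```

With equal weights the two regions hold four cells each; raising the weight of the first generating point leaves it only the two nearest cells, which is the shrinking effect the segmentation relies on.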
5.1.2 Determination of passage orders

The order of passage through the partial regions is determined using representative points placed at the center of gravity $G_i$ of each partial region; each representative point is connected to the representative points of the adjacent partial regions. Two representative points are placed at each center of gravity $G_i$. To decide which partial regions are passed through with cleaning, we determined the order of passage between partial regions with reference to the genetic algorithm for the selective travelling salesman problem by Piwonska et al. [8].

Let $d_{ij}$ be the cost of moving from representative point $i$ to representative point $j$, $w_i$ the total amount of dirt, and $C$ the movement cost limit. When representative point $j$ corresponds to passing through without cleaning, $d_{ij}$ is the cost of moving between representative points $i$ and $j$ by the shortest route. When representative point $j$ corresponds to cleaning, $d_{ij}$ is the cost of moving over the entire partial region to which $j$ belongs. Furthermore, with the passing order denoted by $r$ and its total travel cost by $l(r)$, the passing order is searched so that $l(r)$ does not exceed the limit $C$.

The genetic algorithm that determines the order of passage between partial regions consists of three stages: random generation of initial solutions, calculation of the fitness of the solutions, and repeated genetic manipulation. In this way, the passing order of the partial regions is brought closer to the optimal solution.

5.1.3 Determination of route

The route within each partial region is determined with the Distance Transform method of Zelinsky et al. [9].
How the robot moves within the $i$-th partial region is determined by the Distance Transform method, using as the generating point the cell, among the cells of the $i$-th partial region, that is adjacent to the $(i+1)$-th partial region. For the last partial region in the passing order, however, the starting point is used as the generating point.

Simulation experiments were conducted to confirm whether the proposed algorithm can determine a route that collects dirt efficiently by taking the contamination bias into account.

6 Determination of partial regions of passing order by Dijkstra Algorithm

The passing order of the partial regions is determined with the Dijkstra algorithm. Two points are placed at the center of every partial region, and these points are connected as a network according to the adjacency relationships between sub-regions, as shown in Fig.5. The two points represent the centroid $G_i$ and correspond to passing through the partial region with cleaning and without cleaning.

Fig.5 Image of Dijkstra method

In this proposal, the determination of the passing order of the partial regions is based on the Dijkstra algorithm. $d_{ij}$ denotes the moving cost from point $i$ to point $j$, $w_i$ the total amount of dirt, and $C$ the limit on the moving cost. When $j$ is a representative point that corresponds to passing without cleaning, $d_{ij}$ is the cost of moving between the representative points $i$ and $j$ along the shortest path. When $j$ corresponds to cleaning, $d_{ij}$ is taken to be the cost of moving over the entire partial region to which the representative point $j$ belongs. With $r$ denoting the passing order and $l(r)$ its total moving cost, the passing order is searched so that the total moving cost $l(r)$ of the cleaning robot stays within the limit $C$.
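As an illustration of searching such a network of representative points, the following is a minimal sketch of a standard Dijkstra search. The four-node graph, its node names, and the moving costs are hypothetical; the handling of the two points per centroid (with and without cleaning) is left out for brevity.

```python
import heapq

def dijkstra(graph, start):
    """Cheapest cost from start to every node; edge costs must be non-negative."""
    cost = {node: float("inf") for node in graph}
    previous = {node: None for node in graph}
    cost[start] = 0.0
    queue = [(0.0, start)]
    confirmed = set()
    while queue:
        current_cost, node = heapq.heappop(queue)
        if node in confirmed:
            continue  # stale queue entry for an already confirmed node
        confirmed.add(node)
        for neighbour, edge_cost in graph[node].items():
            candidate = current_cost + edge_cost
            if candidate < cost[neighbour]:
                cost[neighbour] = candidate
                previous[neighbour] = node  # remember where the path came from
                heapq.heappush(queue, (candidate, neighbour))
    return cost, previous

def path_to(previous, target):
    """Follow the 'previous' links back from target to the start node."""
    path = []
    while target is not None:
        path.append(target)
        target = previous[target]
    return path[::-1]

# Hypothetical network of four representative points with symmetric costs d_ij.
graph = {
    "G1": {"G2": 2.0, "G3": 5.0},
    "G2": {"G1": 2.0, "G3": 1.0, "G4": 4.0},
    "G3": {"G1": 5.0, "G2": 1.0, "G4": 1.0},
    "G4": {"G2": 4.0, "G3": 1.0},
}
cost, previous = dijkstra(graph, "G1")
route_to_g4 = path_to(previous, "G4")
```

Storing each node's outgoing edges in its own dictionary keeps the edge lookup local to the confirmed node, which is one simple way to realize adjacency-based access.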
By repeatedly searching for routes using the fitness of randomly generated initial solutions, this method brings the order of the partial regions closer to the optimal solution.

7 Route planning by Dijkstra algorithm

7.1 Features of Dijkstra algorithm

The features of the Dijkstra algorithm are summarized below.

a) Edges with negative cost (distance or time) cannot be handled.
b) The shortest distances and routes from a specific node to all other nodes are found.

The algorithm is summarized as follows.

1. Initialization: set the value of the start node (the minimum cost candidate) to 0, and set the values of the other nodes to undefined or ∞.
2. Repeat the following loop until no unconfirmed node can be picked up, that is, until nothing changes. Among the nodes that have not yet been confirmed, find the node with the smallest value, make it a confirmed node and set its confirmation flag.
3. Check each edge extending from the confirmed node. Calculate the cost of the confirmed node plus the cost of the edge, and update the node at the other end of the edge if this sum is smaller than its current value. If route information is needed, also store a variable that records where the path came from, pointing to the confirmed node.

In the Dijkstra algorithm, edges are referenced from nodes, so storing the edge information at the source node of each connection makes access easy.

7.1.1 Generation of initial solutions

An initial solution starts from the starting point of the robot. The next moving candidate is selected at random from the partial regions adjacent to the current one, and the moving cost to the representative point of that candidate is added to the cost $l(r)$. If at this point the total moving cost satisfies $l(r) \le C/2$, the robot moves to the selected representative point and the same operation is performed from there. The representative point that the robot has just left is removed from the candidates for the next destination, which prevents repeated movement between the same two partial regions. If the total moving cost becomes $l(r) > C/2$, the robot does not move to the candidate and returns to the starting point along exactly the same route as the outward route. As a result, the total moving cost of every initial solution stays within the moving cost limit $C$. This operation is repeated a set number of times (the population size) to generate the set of initial solutions. The generated initial solutions provide the first reward of the solution when Q-learning is added to Dijkstra's algorithm.
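The generation of initial solutions can be sketched as follows. This is a minimal illustration under stated assumptions: moving costs are symmetric and positive, only the point just left is excluded from the next candidates, and the ring of four regions, its costs, the limit, and the function name are all hypothetical.

```python
import random

def generate_initial_route(adjacency, cost, start, limit, rng):
    """Random outward walk while l(r) <= C/2, then return along the same route."""
    route = [start]
    outward_cost = 0.0
    previous = None
    while True:
        # Exclude the point just left, to avoid bouncing between two regions.
        candidates = [n for n in adjacency[route[-1]] if n != previous]
        if not candidates:
            break
        nxt = rng.choice(candidates)
        step = cost[(route[-1], nxt)]
        if outward_cost + step > limit / 2:
            break  # the candidate would exceed C/2, so turn back here
        outward_cost += step
        previous = route[-1]
        route.append(nxt)
    # Return to the start through exactly the same points in reverse order.
    full_route = route + route[-2::-1]
    return full_route, 2 * outward_cost

# Hypothetical ring of four partial regions with symmetric moving costs.
adjacency = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
cost = {}
for a, b, d in [("A", "B", 2.0), ("B", "C", 3.0), ("C", "D", 2.0), ("D", "A", 4.0)]:
    cost[(a, b)] = d
    cost[(b, a)] = d

rng = random.Random(0)  # fixed seed so the sketch is repeatable
limit_c = 12.0          # assumed moving cost limit C
initial_solutions = [
    generate_initial_route(adjacency, cost, "A", limit_c, rng) for _ in range(5)
]
```

Because the outward walk never exceeds C/2 and the return repeats it in reverse, every generated route has a total cost of at most C.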
Next, for the randomly generated initial solutions, the fitness of each solution is calculated with the evaluation function given by equations (3) and (4):

$$\mathrm{eval}(r) = \begin{cases} f(r) & \text{if } l(r) \le C \\ f(r)\,p(r) & \text{if } C < l(r) \end{cases} \qquad (3)$$

$$p(r) = 1 - \frac{l(r) - C}{C} \qquad (4)$$

Here, $f(r)$ is the total amount of dirt removed when cleaning in the passing order $r$, and $p(r)$ is a penalty function imposed in proportion to the amount by which the total moving cost $l(r)$ exceeds the limit $C$. Fitness is a value that indicates how well a solution suits the environment. In the proposed algorithm, a solution that collects a large amount of dirt $f(r)$ with a total moving cost $l(r)$ below the limit receives a high fitness.

8 Rescorla-Wagner model

In this research, we introduce classical conditioning with two conditions: dirt and no dirt. Dirt is treated as the conditional stimulus (CS) and is given the variable $V$. As trials are repeated, $V$ changes, so that an association forms between the conditional stimulus and the unconditional stimulus. A representative model is the Rescorla-Wagner model:

$$V(t+1) = V(t) + \alpha\,\bigl(R(t) - V(t)\bigr) \qquad (5)$$

The term $R(t) - V(t)$ in equation (5) is the reward prediction error. Writing it as $\delta(t)$, the model can be expressed as two separate formulas:

$$\delta(t) = R(t) - V(t) \qquad (6)$$

$$V(t+1) = V(t) + \alpha\,\delta(t) \qquad (7)$$

Fig.6 Expected reward V
Fig.7 Reward R
Fig.8 α = 0.1
Fig.9 α = 0.2
Fig.11 α = 0.4
Fig.12 α = 0.5
Fig.14 α = 0.7
Fig.15 α = 0.8
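The two-step Rescorla-Wagner update, a prediction error followed by a value update, can be illustrated with a short sketch. The number of trials (500), the reward probability (0.8), the random seed, and the function name are assumptions chosen only for illustration.

```python
import random

def rescorla_wagner(rewards, alpha, v0=0.0):
    """Return the expected reward V after each trial, starting from v0."""
    values = [v0]
    for reward in rewards:
        delta = reward - values[-1]                # prediction error
        values.append(values[-1] + alpha * delta)  # value update
    return values

# Assumed reward sequence: R = 1 with probability 0.8, otherwise R = 0.
rng = random.Random(0)
rewards = [1.0 if rng.random() < 0.8 else 0.0 for _ in range(500)]

values_slow = rescorla_wagner(rewards, alpha=0.1)
values_fast = rescorla_wagner(rewards, alpha=0.9)
```

With a small learning rate the estimate moves a little on every trial and settles near the average reward, while a large learning rate follows the most recent reward closely and therefore fluctuates more.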
Fig.10 α = 0.3
Fig.13 α = 0.6
Fig.16 α = 0.9

The Rescorla-Wagner model was applied to the cleaning robot, and a simulation was carried out with the following settings.

    % Number of trials
    T = 5;
    % Learning rate
    alpha = 0.1;
    % Initial value of V
    V(1) = 0;
    % Inverse temperature parameter beta
    beta = 3;

Next, the rewards for the trials are set. Here, 80% of the trials are given the reward (R = 1), and the remaining 20% are given no reward (R = 0). Fig.?? and Fig.?? show the results of learning: the horizontal axis is the number of trials, and the expected reward V and the reward R are plotted on the vertical axis. Furthermore, Fig.6 to Fig.?? suggest that a low learning rate such as 0.1 gives cautious and stable learning.

9 Discussion

With the proposed algorithm, we confirmed a high dirt pick-up rate that remains stable regardless of the dirt distribution. We concluded that the proposed algorithm is suitable for path planning.

We consider that the dirt pick-up performance can be improved further if the loss is distributed to the path selection within the partial regions or to the path priorities.

[1] C, vol.64-626, pp.3854 386, 1998.
[2] Shinjuku, D., Shibuya, J. and Tokyo, M., Swing Motion Control of Casting Manipulation, IEEE Control Systems, vol.9-4, pp.56 64, 1999.