Scholarly article on topic 'Evaluating City Logistics Measure in E-Commerce with Multiagent Systems'


CC BY-NC-ND
Keywords: City logistics; urban freight transport; stakeholder; multi-agent systems; Q-learning; freight vehicle road pricing

Abstract of research paper on Economics and business, author of scientific article — Joel S.E. Teo, Eiichi Taniguchi, Ali Gul Qureshi

Abstract This paper presents a multi-agent systems (MAS) model to evaluate City Logistics measures for an urban road network in an e-commerce delivery system environment. The most notable contribution of this evaluation methodology is the combination of the vehicle routing and scheduling problem with time windows (VRPTW), auction theory and reinforcement learning in a multi-agent framework. This approach seeks to represent the behaviour of each stakeholder involved in the delivery of goods between producers and customers. The preliminary results of the model show that Government-driven City Logistics measures such as freight vehicle road pricing have the potential to reduce truck emissions when the administrator learns and prices the road links.


Available online at www.sciencedirect.com

SciVerse ScienceDirect

Procedia - Social and Behavioral Sciences 39 (2012) 349 - 359

The Seventh International Conference on City Logistics

Evaluating city logistics measure in e-commerce with multiagent systems

Joel S.E. Teo a,*, Eiichi Taniguchi a, Ali Gul Qureshi a

a Department of Urban Management, Kyoto University, Nishikyo-ku, Kyoto, 615-8540, Japan


© 2012 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of 7th International Conference on City Logistics


1. Introduction

1.1. Research motivation and objectives

The rise of e-commerce has made purchasing consumer products more convenient, and sometimes cheaper, for customers. Examples such as Taobao, under the Alibaba Group in China, venturing overseas and Rakuten of Japan making English the official language within the company illustrate the drive and business strategies behind this growth (Wikipedia, 2011; Davis, 2010). This type of business is termed business-to-consumer (B2C) e-commerce, and several studies have examined its link with city logistics and its impact on urban transport (Visser and Hassall, 2006; Esser and Kurte, 2006). One purpose of e-commerce is direct sales from producer to consumer, as explained by Visser et al. (2001), and such a marketing strategy bypasses the traditional retailing channel. A conservative view is that e-commerce may not necessarily decrease the need for retail floor space or affect the shopping centres in the city. However, it is reasonable to say that some functions of retailers may be taken over by producers or distribution centres as retail patronage decreases amid the ageing population in some countries. E-commerce and home delivery are closely related (Visser et al., 2001). The rise of a computer-literate ageing population in countries like Japan and Singapore increases the likelihood of more home deliveries. Although consumers are generally concerned about e-payment for security reasons, emerging technology that addresses these concerns will help to strengthen the e-commerce trend. There may be less concern about traffic congestion because passenger traffic is replaced by freight traffic, but other impacts, such as pollution from freight traffic, may affect the health of the population in the city.

* Corresponding author. Tel.: +81-75-3833230; fax: +81-75-9503800. E-mail address: joel.teo@kiban.kuciv.kyoto-u.ac.jp

1877-0428 © 2012 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of 7th International Conference on City Logistics doi:10.1016/j.sbspro.2012.03.113

In the research done by Thompson et al. (2001), a model was constructed to evaluate the potential of regional distribution centres and to predict the increase in truck travel distance due to e-commerce, which can cause secondary impacts such as vehicle emissions, fuel consumption and truck operating cost. Several studies have identified city logistics measures and proposed solutions that can be implemented to solve many difficult and complicated problems caused by urban freight movement (Taniguchi et al., 2001; OECD, 2003; Russo & Comi, 2010). The measures that apply to city logistics can be categorised as government driven or company driven (Taniguchi and Nemoto, 2003; Anderson et al., 2005). Government-driven measures include road pricing, truck bans and the regulation of cooperative facilities, while company-driven measures include facility location and warehouse management.

The main purpose of this research is to evaluate the freight vehicle road pricing measure in an e-commerce delivery system environment. The secondary purpose is to build on previous work that used the multi-agent approach (Taniguchi et al., 2006; Tamagawa et al., 2009) and to enhance it so that it can evaluate mainly government-driven city logistics measures, given the rising trend of e-commerce and home delivery.

1.2. Why MAS approach?

The MAS approach is recognised as a useful methodology to consider the multi-objective nature of an urban logistics system and to study the behaviour of the stakeholders who are influenced by policy measures (Taniguchi et al., 2010). The problem considered in this paper fits the characteristics for a MAS approach listed by Parunak (1999), in line with several other agent-based approaches to transport logistics reviewed by Davidsson et al. (2005). A MAS consists of an environment with multiple autonomous agents that can sense, perceive and take action while accounting for their interactions with other agents. Further reading related to MAS can be found in books written by experts in this field (Weiss, 1999; Wooldridge, 2009).

The MAS model described in this paper follows the model framework for VRPTW with dynamic traffic simulation (Taniguchi et al., 2001), as shown in Fig. 1. VRPTW is used in several studies that require a solution for assigning trucks to customers and minimising the cost of delivering goods along a feasible route within the given time windows (Thompson and van Duin, 2003). One example of agent-based research, which helps authorities decide on the best locations for logistic transit points and customer service points, combines VRP models with a microscopic approach using AIMSUN in a decision support system (Barceló et al., 2007). Another example, an agent-based model to optimise the location of intermodal freight hubs by considering hub owners, transport network providers, hub users and communities, is presented by Sirikijpanichkul et al. (2007). There are existing models that predict the impacts of e-commerce (Thompson et al., 2001; Taniguchi et al., 2004), but our proposed model goes further by including the behaviours of multiple stakeholders, including the government policy makers.

It is known that road pricing is a measure for allocating limited road space. Congestion reduction and fewer trucks on the roads are not the purpose of road pricing policy but beneficial results of that initial aim. Government authorities that have implemented road pricing have always emphasised that it is not a revenue-generating tool (Ogden, 1992). Research has examined the expected influence of distance pricing on freight transport in urban areas based on in-depth interviews with carriers (Quak and van Duin, 2010), carriers' responses to time-of-day pricing based on data obtained from an implemented road pricing scheme (Holguin-Veras et al., 2006), and the myths and possibilities of freight road pricing based on empirical evidence supported by game-theoretic analysis (Holguin-Veras, 2010). This paper aims to complement past studies by simulating the e-commerce delivery system using the MAS modelling approach and evaluating the impact of freight road pricing.

Fig. 1. VRPTW model with dynamic traffic simulation framework

2. Multi-agent model framework

The MAS model, as shown in Fig. 2, receives the carrier data, customer data and network data and then runs the insertion heuristics for the VRPTW model. Carriers are assumed to use VRPTW, based on the fixed demand and time windows, to bid for the delivery. The MAS model includes second-price auctioning from auction theory (Krishna, 2010), which is closely linked to game theory, to represent the behaviour of the shippers (or producers) in the test road network.
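The insertion idea behind the carriers' routing step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single truck, no capacity limit, straight-line distances, customers considered in the order given, and time windows expressed in hours after the truck leaves the depot. All names (`route_cost`, `insertion_heuristic`, the coordinates and windows) are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]    # coordinates in km (assumed)
Window = Tuple[float, float]   # (earliest, latest) arrival, hours after departure

SPEED_KMH = 30.0  # constant truck speed, matching the 30 km/h assumption in the paper


def travel_time(a: Point, b: Point) -> float:
    """Hours needed to cover the straight-line distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) / SPEED_KMH


def route_cost(route: List[str], coords: Dict[str, Point],
               windows: Dict[str, Window], depot: str) -> Optional[float]:
    """Travel time of depot -> route -> depot, or None if a window is missed."""
    clock, total, prev = 0.0, 0.0, depot
    for stop in route:
        leg = travel_time(coords[prev], coords[stop])
        total += leg
        clock += leg
        earliest, latest = windows[stop]
        if clock > latest:             # arrived after the window closed
            return None
        clock = max(clock, earliest)   # wait if the truck arrives early
        prev = stop
    return total + travel_time(coords[prev], coords[depot])


def insertion_heuristic(customers: List[str], coords: Dict[str, Point],
                        windows: Dict[str, Window],
                        depot: str) -> Tuple[List[str], List[str]]:
    """Insert each customer at its cheapest feasible position in the route."""
    route: List[str] = []
    unrouted: List[str] = []
    for customer in customers:
        best: Optional[Tuple[float, List[str]]] = None
        for pos in range(len(route) + 1):
            candidate = route[:pos] + [customer] + route[pos:]
            cost = route_cost(candidate, coords, windows, depot)
            if cost is not None and (best is None or cost < best[0]):
                best = (cost, candidate)
        if best is None:
            unrouted.append(customer)  # no feasible position for this truck
        else:
            route = best[1]
    return route, unrouted


# Hypothetical data: one depot and three customers.
coords = {"depot": (0.0, 0.0), "A": (6.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0)}
windows = {"A": (0.0, 0.6), "B": (0.0, 0.12), "C": (0.3, 1.0)}
route, unrouted = insertion_heuristic(["A", "B", "C"], coords, windows, "depot")
print(route, unrouted)
```

With these made-up inputs the tight window of B forces it to the front of the route, and C is placed last because that position gives the shortest feasible tour. A carrier's bid could then be derived from the resulting route cost, although how the paper maps route cost to a bid is not specified here.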

Fig. 2. MAS framework for evaluating freight vehicle road pricing

In a typical second-price auction the highest bid wins, whereas second-price auctioning in freight delivery aims to drive the bid as low as possible. In a second-price auction for freight delivery, the carrier with the lowest bid wins and is paid an amount equal to the second-lowest bid. The payoff of the winning carrier is as follows:

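$$\Pi_i = \begin{cases} \min_{j \neq i} b_j - x_i & \text{if } b_i < \min_{j \neq i} b_j \\ 0 & \text{if } b_i > \min_{j \neq i} b_j \end{cases} \qquad (1)$$

where,

$\Pi_i$ : Payoff of carrier i

$b_i$ : Bid of carrier i

$b_j$ : Bid of other carriers, excluding carrier i

$x_i$ : Term subtracted from the payment received by carrier i; in the standard second-price formulation (Krishna, 2010) this is the bidder's own valuation, read here as carrier i's cost for the delivery, which coincides with $b_i$ under truthful bidding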

The process of competitive pricing benefits all parties with a common interest in transferring a job from one to the other under a compromise agreement. In this research, the producers demand delivery service from carriers at a low price, while carriers prefer to profit from a higher price. A game-theoretic second-price auction can model this conflict: it represents the interaction between carriers, while the producers obtain the best and truthful price for the delivery service. Typical agents such as carriers, shippers, administrators, motorway operators and residents in the previous model (Tamagawa et al., 2009) represent the key stakeholders involved in a logistics platform meeting. In our approach, the model considers the producers as shippers in B2C e-commerce engaging in a negotiation-like process using second-price auction theory. The motorway operators are not considered in this paper, as the road network is assumed to have no existing highways and there is no intention of building additional roads.
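The auction rule described above can be sketched in a few lines. This is an illustrative reading rather than the authors' code: the function names, the example bids and the tie-breaking rule (alphabetical by carrier name) are assumptions, and the winner's payoff is taken as the payment received minus the carrier's own delivery cost.

```python
from typing import Dict, Tuple


def second_price_reverse_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner, payment): the lowest bidder wins, paid the second-lowest bid."""
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    # Sort by bid, then by name so that ties are broken deterministically (assumed rule).
    ranked = sorted(bids.items(), key=lambda item: (item[1], item[0]))
    winner = ranked[0][0]
    payment = ranked[1][1]
    return winner, payment


def payoff(carrier: str, cost: float, bids: Dict[str, float]) -> float:
    """Payoff of one carrier: payment minus own cost if it wins, otherwise zero."""
    winner, payment = second_price_reverse_auction(bids)
    return payment - cost if carrier == winner else 0.0


# Hypothetical bids from the four carriers of the test network.
bids = {"Carrier 1": 120.0, "Carrier 5": 100.0, "Carrier 21": 135.0, "Carrier 25": 110.0}
winner, payment = second_price_reverse_auction(bids)
print(winner, payment)                    # lowest bidder and the second-lowest bid
print(payoff("Carrier 5", 100.0, bids))   # winner's payoff when its cost equals its bid
print(payoff("Carrier 1", 120.0, bids))   # losing carriers receive no payoff
```

Because the payment does not depend on the winner's own bid, a carrier cannot raise its payment by overstating its cost, which is why this mechanism is associated with truthful bidding.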

Reinforcement learning is used in this model to represent the behaviour of agents during decision making. It has been agreed that intelligent agents should be able to learn, and that systems with agents capable of learning can be called intelligent (Sen & Weiss, 1999). As such, this paper starts by treating the administrator as an intelligent agent who learns from the experience of his past actions. The process of reinforcement learning involves the administrator sensing the environment and taking actions that change the current state in order to reach his goals (Weiss, 1999). In this paper, only the administrator is assumed to learn, after receiving the pollution level from the routes executed by the successful bidders (carriers). The reinforcement learning method used is the Q-learning algorithm, first introduced by Watkins (1989), chosen for its efficiency as tested in a previous study (Tamagawa et al., 2009). The updating formula in Q-learning for the administrator is as follows:

$$Q(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t) + \alpha \left[ r_t + \gamma \min_{a_{t+1}} Q(s_{t+1}, a_{t+1}) \right] \qquad (2)$$

where,

$Q(s_t, a_t)$ : expected NOx level in state t due to action in state t

$Q(s_{t+1}, a_{t+1})$ : expected NOx level in state t+1 of all actions

$\gamma$ : discount rate for administrator ($0 < \gamma < 1$)

$\alpha$ : learning rate of administrator ($0 < \alpha < 1$)

$r_t$ : immediate NOx level in state t due to action in state t
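A minimal sketch of the update in equation (2) follows. It is only an illustration: the table layout, the state and action labels and the default value of 0.0 for unseen state-action pairs are assumptions, not details taken from the paper. Because the administrator wants low NOx, the update takes the minimum over next actions rather than the maximum used in reward-maximising Q-learning.

```python
from typing import Dict, Hashable, Iterable, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, state: Hashable, action: Hashable, nox: float,
             next_state: Hashable, next_actions: Iterable[Hashable],
             alpha: float, gamma: float) -> float:
    """Apply one Q-learning update where Q estimates a NOx level to be minimised."""
    # Best (lowest) expected NOx reachable from the next state.
    # next_actions must not be empty; unseen pairs default to 0.0 (assumed).
    best_next = min(q.get((next_state, a), 0.0) for a in next_actions)
    old = q.get((state, action), 0.0)
    new = (1.0 - alpha) * old + alpha * (nox + gamma * best_next)
    q[(state, action)] = new
    return new


# Hypothetical table: pricing actions "low" and "high" in two states.
q: QTable = {("s0", "low"): 25.0, ("s1", "low"): 20.0, ("s1", "high"): 10.0}
updated = q_update(q, "s0", "low", nox=5.0, next_state="s1",
                   next_actions=["low", "high"], alpha=0.8, gamma=0.8)
print(updated)  # 0.2 * 25 + 0.8 * (5 + 0.8 * 10), up to floating-point rounding
```

With `alpha=0.0` the stored value is returned unchanged, which matches the statement that a learning rate of 0 means no learning.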

A learning rate of 1 means that the administrator considers only the most recent information, while 0 means that the administrator does not learn. A discount rate of 1 means that the administrator considers the long-term reward, while 0 means that the administrator is concerned only with current rewards. The emission of oxides of nitrogen (NOx) is estimated using equation (3) (NILIM, 2003), assuming light delivery vehicles using diesel fuel.

$$NOx = l_{ij} \left( 1.06116 + 0.000213\, v_{ij}^{2} - 0.0246\, v_{ij} \right) \qquad (3)$$

where,

$NOx$ : expected nitrogen oxide emission in grams

$l_{ij}$ : length of road link between nodes i and j in kilometres

$v_{ij}$ : speed of vehicle travelling on road link between nodes i and j
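As a worked illustration of equation (3), the sketch below evaluates the emission for one link and sums it over the four city-centre links. The speed unit (km/h) and the link lengths of 1 km are assumptions made for the example; the paper does not list the link lengths.

```python
from typing import Dict, Tuple


def nox_grams(length_km: float, speed_kmh: float) -> float:
    """NOx emitted on one link, in grams, following equation (3)."""
    per_km = 1.06116 + 0.000213 * speed_kmh ** 2 - 0.0246 * speed_kmh
    return length_km * per_km


# Hypothetical lengths (km) for the city-centre links 8-13, 12-13, 13-14 and 13-18.
city_links: Dict[Tuple[int, int], float] = {
    (8, 13): 1.0,
    (12, 13): 1.0,
    (13, 14): 1.0,
    (13, 18): 1.0,
}

one_link = nox_grams(1.0, 30.0)  # one truck, one kilometre, 30 km/h
city_nox = sum(nox_grams(length, 30.0) for length in city_links.values())
print(round(one_link, 5), round(city_nox, 5))
```

At 30 km/h the bracketed term evaluates to 1.06116 + 0.1917 - 0.738 = 0.51486 g per kilometre, so a single truck pass over all four assumed links contributes about 2.06 g.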

3. Results and discussion

The MAS model is applied to the test road network shown in Fig. 3. Four carriers are located at nodes 1, 5, 21 and 25 and are named Carriers 1, 5, 21 and 25 respectively. Nodes 8, 12, 13, 14 and 18 are the locations of producers, while the rest of the nodes represent the customers. The MAS model is iterated for one year of 360 days; longer runs are of limited practical value under the assumption of fixed demand, time windows and travel time. The arbitrary pick-up capacity of all the producers is equivalent to the delivery demand of the customers, and both are assumed to be fixed during the model run. The randomly assigned time windows for pick-up from the producers are in the morning, while the delivery time windows of the customers are in the afternoon. The velocity of trucks is assumed to be 30 km/h. The model begins by running the insertion-heuristic VRPTW for the pick-up process from producers; trucks are assumed to return to the carrier's depot for repacking before the next VRPTW is run for delivery to the customers.

Fig. 3. Test road network

As the objective of the administrator is to lower the NOx level in the city centre, the NOx levels of links 8-13, 12-13, 13-14 and 13-18 are summed to represent the NOx level for the city, as shown in Figures 4 and 5 with varying learning rates. Because we assume no change in the demand, time windows, truck capacity, travel time and cost of travel, the same carrier will always win the bid, owing to its location in the test network, after several iterations when there is no learning and no freight vehicle road pricing. The optimal route of this winning carrier therefore causes the NOx emission in the city to stabilise after many iterations, which is represented by the solid black line in Figures 4 and 5. We can observe from Figures 4 and 5 that when the administrator learns, the NOx level in the city can fluctuate to as low as about 2 g. The mean level of NOx without learning is about 25 g, while the mean levels with varying learning and discount rates are lower, as shown in Table 1. The NOx level occasionally rises above 25 g in Figures 4 and 5; this can be explained by the fact that some external links outside the city are priced lower during that particular iteration and the truck traffic chooses to travel on the lower-cost links in the city. Because the standard deviation of 9.551 for α = 0.8 and γ = 0.8 is the lowest in Table 1, we used the data sets of this model run to evaluate the effectiveness of freight vehicle road pricing further.

Given the static scenario, the only source of dynamics is the administrator's action of pricing the road links as he learns in each iteration. The cumulative NOx level in the city over the entire episode is lower than the level when the administrator does not learn, as presented in Figure 6. These results show that freight vehicle road pricing has the potential to reduce the NOx level in the city when the administrator starts to learn from his experience and varies the pricing rate of the road links to meet his objective. The MAS approach also allows the administrator to evaluate the pricing rate for each direction of travel on a road link. A typical graph of the stabilised pricing rate for a link is shown in Figure 7. This graph shows the results for link 13-14, with a rate of 1103 yen for the direction from node 14 to node 13 and 1042 yen from node 13 to node 14. As seen from the graph, the pollution level starts to decrease when the pricing rate increases at around day 30. The MAS model is not evaluated for longer than a year, as the condition of fixed demand and travel time limits its practicality. As the results are preliminary, more work will be required to include more dynamic scenarios in evaluating the effectiveness of freight vehicle road pricing. The MAS model can be further improved by generating dynamic demand and incorporating dynamic travel time using micro-simulation software like VISSIM (PTV, 2008), which will be considered in future research.

[Plot: NOx level within city for a year; horizontal axis: Days; legend: without learning and two learning settings (α, γ)]

Fig. 4. NOx level of city with varying learning values

Table 1. Mean and standard deviation of NOx level with varying learning values

Measurement                   Values
Learning rate, α              0.5       0.5       0.8       0.8
Discount rate, γ              0.5       0.8       0.5       0.8
Mean, μ (g)                   23.390    23.250    23.796    23.382
Standard deviation, σ (g)     9.872     9.896     10.455    9.551
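The comparison made from Table 1 can be reproduced with a short sketch that stores the reported means and standard deviations and picks the parameter pair with the smallest spread. The dictionary layout and variable names are illustrative only; the numbers are those reported in the table.

```python
from typing import Dict, Tuple

# (learning rate, discount rate) -> (mean NOx, standard deviation of NOx), from Table 1
results: Dict[Tuple[float, float], Tuple[float, float]] = {
    (0.5, 0.5): (23.390, 9.872),
    (0.5, 0.8): (23.250, 9.896),
    (0.8, 0.5): (23.796, 10.455),
    (0.8, 0.8): (23.382, 9.551),
}

# Setting whose daily NOx level fluctuates least.
most_stable = min(results, key=lambda pair: results[pair][1])
# Setting with the lowest average NOx level.
lowest_mean = min(results, key=lambda pair: results[pair][0])

print(most_stable, lowest_mean)
```

The pair with the lowest standard deviation is the one used for the further evaluation in the text; note that it is not the pair with the lowest mean, so the choice reflects a preference for stability over the smallest average.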

[Plot: NOx level within city for a year; legend: without learning and two learning settings (α, γ)]

Fig. 5. NOx level of city with varying learning values

[Plot: Cumulative NOx level within city for a year]

Fig. 6. Cumulative NOx level of city links with respective learning parameters

[Plot: Cumulative NOx level and average pricing for link 13-14 for a year]

Fig. 7. Combined plot of NOx level and average pricing for link 13-14

4. Conclusion

The MAS model explained in this paper incorporates the behaviour of major stakeholders in an e-commerce delivery system. The use of VRPTW for carriers, second-price auctioning for producers in the city centre and Q-learning for the administrator seeks to provide the administrator with an evaluation tool for freight vehicle road pricing. Q-learning was used to simulate the administrator's learning from experience and decision making, which is assumed to represent the administrator's behaviour in a real-world situation. In the MAS model, the administrator is assumed to observe the network environment, form a perception and learn from the environment before he sets his next action. This process is repeated until a stop criterion is met; in this case, we stopped the iteration after one year because of the static scenario assumed for time windows, demand and travel time.

The preliminary results from varying the administrator's learning rate and discount rate show the potential for reducing the NOx level in the city when the administrator learns and prices each link in the test road network based on the outcome of past actions and on future rewards. The MAS modelling approach, which assumes the behaviour of the agents, gives the administrator insight into how freight vehicle road pricing can influence the test road network and the environmental impact on each link. Several kinds of data can be obtained from this model; among the most relevant for the administrator is the possible rate of road pricing for freight vehicles needed to achieve the NOx reduction, as shown in the "Results and discussion" section of this paper. The current MAS model is to be developed beyond the limit of a one-year evaluation by relaxing the assumptions of fixed demand, travel time and time windows. Future research is expected to include dynamic demand and varying travel time obtained from micro-simulation software like VISSIM (PTV, 2008).

Acknowledgements

The author wishes to thank all professors and staff in the Logistics Management Systems laboratory, the Unit for Liveable Cities and the Global COE Program of Kyoto University for their guidance and support.

References

[1] Anderson S, Allen J, Browne M. Urban logistics - How can it meet policy makers' sustainability objectives. Journal of Transport Geography 2005; 13: 71-81.

[2] Barceló J, Grzybowska H, Pardo S. Vehicle routing and scheduling models, simulation and city logistics. In: Zeimpekis V, Tarantilis CD, Giaglis GM, Minis I, editors. Dynamic fleet management: Concepts, systems, algorithms & case studies, New York, USA: SpringerLink; 2007, p. 163-195.

[3] Davidsson P, Henesey L, Ramstedt L, Törnquist J, Wernstedt F. An analysis of agent-based approaches to transport logistics. Transportation Research Part C 2005; 13: 255-271.

[4] Davis A. Taobao, Yahoo Japan team up for e-commerce deal, 2010. Available online at http://www.cei.asia/searcharticle/2010_05/Taobao-Yahoo-Japan-team-up-for-e-commerce-deal/39861. Retrieved on 14 February, 2011.

[5] Esser K, Kurte J. B2C e-commerce: Impact on transport in urban areas. In: Taniguchi E, Thompson RG, editors. Recent advances in city logistics: Proceedings of the 4th International Conference on City Logistics. Amsterdam, The Netherlands: Elsevier; 2006, p. 437-448.

[6] Holguin-Veras J. The truth, the myths and the possible in freight road pricing in congested urban areas. The Sixth International Conference on City Logistics, Puerto Vallarta, Mexico. Elsevier; 2010, p. 6366-6377.

[7] Holguin-Veras J, Wang Q, Xu N, Ozbay K, Cetin M, Polimeni J. The impacts of time of day pricing on the behaviour of freight carriers in a congested urban area: Implications to road pricing. Transportation Research Part A 2006; 40: 744-766.

[8] Krishna V. Auction Theory 2nd edition. London, UK: Elsevier; 2010.

[9] NILIM. Qualitative appraisal index calculations used for basic unit computation of CO2, NOx, SPM, 2003 (in Japanese).

[10] OECD. Delivering the goods - 21st century challenges to urban goods transport. USA: Organisation for Economic Cooperation and Development; 2003.

[11] Ogden KW. Urban goods movement: A guide to policy and planning. USA: Ashgate Publishing Company; 1992.

[12] Parunak H. Industrial and practical applications of DAI. In: Weiss G, editor. Multiagent systems. Cambridge: MIT Press; 1999.

[13] PTV AG. VISSIM 5.10 User Manual. Karlsruhe, Baden-Württemberg, Germany; 2008.

[14] Quak H, Van Duin J. The influence of road pricing on physical distribution in urban areas. The Sixth International Conference on City Logistics, Puerto Vallarta, Mexico. Elsevier; 2010, p. 6141-6153.

[15] Russo F, Comi A. A classification of city logistics measures and connected impacts. The Sixth International Conference on City Logistics. Puerto Vallarta, Mexico. Elsevier; 2010, p. 6355-6365.

[16] Sen S, Weiss G. Learning in multiagent systems. In: Weiss G, editor. Multiagent systems: A modern approach to distributed artificial intelligence, Cambridge, Massachusetts: The MIT Press; 1999, p. 259-298.

[17] Sirikijpanichkul A, Van Dam K, Ferreira L, Lukszo Z. Optimizing the location of intermodal freight hubs: An overview of the agent based modelling approach. J Transpn Sys Eng & IT 2007; 7(4): 71-81.

[18] Tamagawa D, Taniguchi E, Yamada T. Evaluating city logistics measures using a multi-agent model, 2009.

[19] Taniguchi E, Kakimoto Y. Modelling effects of e-commerce on urban freight transport. In: Taniguchi E, Thompson RG, editors. Logistics systems for sustainable cities, Oxford, UK: Elsevier; 2004, p. 135-146.

[20] Taniguchi E, Nemoto T. Transport-demand management for freight transport. In: Taniguchi E, Thompson RG, editors. Innovations in freight transport. Southampton, UK: WIT Press; 2003, p. 101-124.

[21] Taniguchi E, Thompson RG, Yamada T. Incorporating risks in City Logistics. The Sixth International Conference on City Logistics, Puerto Vallarta, Mexico. Elsevier; 2010, p. 5899-5910.

[22] Taniguchi E, Thompson RG, Yamada T, Van Duin R. City logistics: Network modelling and intelligent transport systems. Netherlands: Elsevier Science Ltd; 2001.

[23] Taniguchi E, Yamada T, Okamoto M. Multi-agent modelling for evaluating dynamic vehicle routing and scheduling systems. Proceedings of the Eastern Asia Society for Transportation Studies 2006; 6.

[24] Thompson RG, Chiang C, Jeevaptsa M. Modelling the effects of e-commerce. In: Taniguchi E, Thompson RG, editors. City Logistics II, Kyoto, Japan: Institute of Systems Science Research; 2001, p. 99-110.

[25] Thompson R, Van Duin R. Vehicle routing and scheduling. In: Taniguchi E, Thompson RG, editors. Innovations in freight transport, UK: WIT Press; 2003, p. 47-63.

[26] Visser J, Hassall K. The future of city logistics: Estimating the feasibility of home delivery in urban areas. In: Taniguchi E, Thompson RG, editors. Recent advances in city logistics: Proceedings of the 4th International Conference on City Logistics, Amsterdam, The Netherlands: Elsevier; 2006, p. 147-161.

[27] Visser J, Nemoto T, Boerkamps J. E-commerce and city logistics. In: Taniguchi E, Thompson RG, editors. City Logistics II, Kyoto, Japan: Institute of Systems Science Research; 2001, p. 35-66.

[28] Watkins C. Learning from delayed rewards. England: PhD Thesis, University of Cambridge; 1989.

[29] Weiss G. Multiagent systems: A modern approach to distributed artificial intelligence. Cambridge, Massachusetts: The MIT Press; 1999.

[30] Wikipedia. Rakuten: http://en.wikipedia.org/wiki/Rakuten, 3 February, 2011. Retrieved on 14 February, 2011.

[31] Wooldridge M. An introduction to multiagent systems - 2nd edition. West Sussex, United Kingdom : John Wiley & Sons; 2009.