[1]吴继浩. 面向航天产品的多目标动态生产调度方法研究及应用[D]. 绵阳:西南科技大学, 2019.
WU Jihao. Research and Application of Multi-objective Dynamic Production Scheduling Method for Aerospace Products[D]. Mianyang:Southwest University of Science and Technology, 2019.
[2]BANSAL N. Algorithms for Flow Time Scheduling[D]. Pennsylvania:Carnegie Mellon University, 2003.
[3]LEONARDI S, RAZ D. Approximating Total Flow Time on Parallel Machines[J]. Journal of Computer and System Sciences, 2007, 73(6):875-891.
[4]SITTERS R. Efficient Algorithms for Average Completion Time Scheduling[C]∥Integer Programming and Combinatorial Optimization. Lausanne, 2010:411-423.
[5]HALL L A, SHMOYS D B, WEIN J. Scheduling to Minimize Average Completion Time:Off-line and On-line Approximation Algorithms[J]. Mathematics of Operations Research, 1997, 22(3):513-544.
[6]MAO H, ALIZADEH M, MENACHE I, et al. Resource Management with Deep Reinforcement Learning[C]∥Proceedings of the 15th ACM Workshop on Hot Topics in Networks. Atlanta, 2016:50-56.
[7]柳丹丹, 龚祝平, 邱磊. 改进遗传算法求解同类并行机优化调度问题[J]. 机械设计与制造, 2020(4):262-265.
LIU Dandan, GONG Zhuping, QIU Lei. Improved Genetic Algorithm for the Optimal Scheduling Problem of Uniform Parallel Machine[J]. Machinery Design & Manufacture, 2020(4):262-265.
[8]许显杨, 陈璐. 考虑设备可靠性与能耗的平行机调度[J]. 上海交通大学学报, 2020, 54(3):247-255.
XU Xianyang, CHEN Lu. Parallel Machine Scheduling Problem Considering Machine Reliability and Energy Consumption[J]. Journal of Shanghai Jiao Tong University, 2020, 54(3):247-255.
[9]GUPTA D, MARAVELIAS C T, WASSICK J M. From Rescheduling to Online Scheduling[J]. Chemical Engineering Research and Design, 2016, 116:83-97.
[10]ZHANG R, CHANG P, SONG S, et al. A Multi-objective Artificial Bee Colony Algorithm for Parallel Batch-processing Machine Scheduling in Fabric Dyeing Processes[J]. Knowledge-based Systems, 2017, 116:114-129.
[11]PINEDO M L. Scheduling:Theory, Algorithms, and Systems[M]. New York:Springer, 2018.
[12]TAO J, LIU T. WSPT’s Competitive Performance for Minimizing the Total Weighted Flow Time:from Single to Parallel Machines[J]. Mathematical Problems in Engineering, 2013, 2013:343287.
[13]ANDERSON E J, POTTS C N. Online Scheduling of a Single Machine to Minimize Total Weighted Completion Time[J]. Mathematics of Operations Research, 2004, 29(3):686-697.
[14]TAO J. A Better Online Algorithm for the Parallel Machine Scheduling to Minimize the Total Weighted Completion Time[J]. Computers & Operations Research, 2014, 43(1):215-224.
[15]ABBEEL P, COATES A, QUIGLEY M, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight[M]∥SCHÖLKOPF B, PLATT J, HOFMANN T. Advances in Neural Information Processing Systems 19:Proceedings of the 2006 Conference. Cambridge:MIT Press, 2007:1-8.
[16]吴晓光, 刘绍维, 杨磊, 等. 基于深度强化学习的双足机器人斜坡步态控制方法[J]. 自动化学报, 2020, 46:1-12.
WU Xiaoguang, LIU Shaowei, YANG Lei, et al. A Gait Control Method for Biped Robot on Slope Based on Deep Reinforcement Learning[J]. Acta Automatica Sinica, 2020, 46:1-12.
[17]王云鹏, 郭戈. 基于深度强化学习的有轨电车信号优先控制[J]. 自动化学报, 2019, 45(12):2366-2377.
WANG Yunpeng, GUO Ge. Signal Priority Control for Trams Using Deep Reinforcement Learning[J]. Acta Automatica Sinica, 2019, 45(12):2366-2377.
[18]袁兆麟, 何润姿, 姚超, 等. 基于强化学习的浓密机底流浓度在线控制算法[J]. 自动化学报, 2021, 47(7):1558-1571.
YUAN Zhaolin, HE Runzi, YAO Chao, et al. Online Reinforcement Learning Control Algorithm for Concentration of Thickener Underflow[J]. Acta Automatica Sinica, 2021, 47(7):1558-1571.
[19]CUNHA B, MADUREIRA A M, FONSECA B, et al. Deep Reinforcement Learning as a Job Shop Scheduling Solver:a Literature Review[C]∥International Conference on Hybrid Intelligent Systems. Porto, 2018:350-359.
[20]SUTTON R S, BARTO A G. Reinforcement Learning:an Introduction[M]. Cambridge:MIT Press, 2018.
[21]LIU C L, CHANG C C, TSENG C J. Actor-critic Deep Reinforcement Learning for Solving Job Shop Scheduling Problems[J]. IEEE Access, 2020, 8:71752-71762.
[22]GABEL T, RIEDMILLER M. Distributed Policy Search Reinforcement Learning for Job-shop Scheduling Tasks[J]. International Journal of Production Research, 2012, 50(1):41-61.
[23]王世进, 孙晟, 周炳海, 等. 基于Q-学习的动态单机调度[J]. 上海交通大学学报, 2007(8):1227-1243.
WANG Shijin, SUN Sheng, ZHOU Binghai, et al. Q-Learning Based Dynamic Single Machine Scheduling[J]. Journal of Shanghai Jiao Tong University, 2007(8):1227-1243.
[24]WANG J, HE J, ZHANG J. A Reinforcement Learning Method to Optimize the Priority of Product for Scheduling the Large-scale Complex Manufacturing Systems[C]∥48th International Conference on Computers & Industrial Engineering (CIE48). Auckland, 2018:2-5.
[25]ZHANG Z, ZHENG L, LI N, et al. Minimizing Mean Weighted Tardiness in Unrelated Parallel Machine Scheduling with Reinforcement Learning[J]. Computers & Operations Research, 2012, 39(7):1315-1324.
[26]GUAN Y, REN Y, LI S E, et al. Centralized Cooperation for Connected and Automated Vehicles at Intersections by Proximal Policy Optimization[J]. IEEE Transactions on Vehicular Technology, 2020, 69(11):12597-12608.
[27]WEI H, LIU X, MASHAYEKHY L, et al. Mixed-autonomy Traffic Control with Proximal Policy Optimization[C]∥IEEE Vehicular Networking Conference (VNC). Los Angeles, 2019:19529967.
[28]GANGAPURWALA S, MITCHELL A, HAVOUTIS I. Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion[J]. IEEE Robotics and Automation Letters, 2020, 5(2):3642-3649.
[29]CHEN Y, MA L. Rocket Powered Landing Guidance Using Proximal Policy Optimization[C]∥4th International Conference on Automation, Control and Robotics Engineering. Shenzhen, 2019:1-6.
[30]ZHU J, WANG H, ZHANG T. A Deep Reinforcement Learning Approach to the Flexible Flowshop Scheduling Problem with Makespan Minimization[C]∥2020 IEEE 9th Data Driven Control and Learning Systems Conference. Liuzhou, 2020:20256682.
[31]RUMMUKAINEN H, NURMINEN J K. Practical Reinforcement Learning - Experiences in Lot Scheduling Application[J]. IFAC-PapersOnLine, 2019, 52(13):1415-1420.
[32]SUTTON R S, MCALLESTER D A, SINGH S P, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation[C]∥Proceedings of the 12th International Conference on Neural Information Processing Systems. Denver, 1999:1057-1063.
[33]SCHULMAN J, LEVINE S, ABBEEL P, et al. Trust Region Policy Optimization[C]∥32nd International Conference on Machine Learning. Lille, 2015:1889-1897.
[34]MNIH V, BADIA A P, MIRZA M, et al. Asynchronous Methods for Deep Reinforcement Learning[C]∥International Conference on Machine Learning. New York, 2016:1928-1937.
[35]KINGMA D P, BA J. Adam:a Method for Stochastic Optimization[C]∥3rd International Conference on Learning Representations. San Diego, 2015. arXiv:1412.6980.