China Mechanical Engineering, 2023, Vol. 34, Issue (21): 2600-2606, 2614. DOI: 10.3969/j.issn.1004-132X.2023.21.009

• Intelligent Manufacturing •

Compound Rules and Reinforcement Learning Based Scheduling Method for Mixed Model Assembly Lines

GUO Jutao1,2; LYU Youlong3; DAI Zheng1; ZHANG Jie3; GUO Yu2

  1. Shanghai Spaceflight Precision Machinery Institute, Shanghai, 201600
  2. School of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016
  3. Institute of Artificial Intelligence, Donghua University, Shanghai, 201620
  • Online: 2023-11-10 Published: 2023-11-29
  • About the author: GUO Jutao, male, born in 1988, senior engineer and doctoral candidate. Research interest: digital integrated manufacturing. E-mail: guojutao800@163.com.
  • Funding:
    Equipment Development Rapid Support Project (JZX7Y20220163200201); Science and Technology Innovation Action Plan Rising-Star Program (ZZQB14042000)

Abstract: To address the balancing and sequencing problems of mixed model assembly lines, an intelligent scheduling method based on compound rules and reinforcement learning is proposed. A balancing rule library and a sequencing rule library are designed from the mathematical model, and a proximal policy optimization (PPO) algorithm that operates on weighted combinations of these rules is proposed. A reinforcement learning process with an Actor-Critic training procedure and a prioritized experience replay mechanism tunes the weight parameters of the compound rules and generates the balancing and sequencing solutions. Comparative experiments against PPO with a single rule, compound rules, and a genetic algorithm verify the effectiveness of the proposed method.

Key words: mixed model assembly line, balancing and sequencing, deep reinforcement learning, compound rule, integrated optimization
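
The idea of a compound rule as a weighted combination of simple rules can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the two rules (`shortest_processing_time`, `earliest_due_date`), the job fields, and the fixed weights are hypothetical and are not taken from the paper's rule libraries. In the proposed method the weights would be adjusted by the PPO agent, which is not shown.

```python
from typing import Callable, Dict, List, Sequence

Job = Dict[str, float]          # hypothetical job record
Rule = Callable[[Job], float]   # a rule maps a job to a score (higher = preferred)


def shortest_processing_time(job: Job) -> float:
    # Hypothetical rule: prefer jobs with shorter processing time.
    return -job["processing_time"]


def earliest_due_date(job: Job) -> float:
    # Hypothetical rule: prefer jobs with earlier due dates.
    return -job["due_date"]


def normalize(scores: Sequence[float]) -> List[float]:
    # Min-max scale scores to [0, 1] so rules with different units are comparable.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def compound_priority(jobs: Sequence[Job], rules: Sequence[Rule],
                      weights: Sequence[float]) -> List[float]:
    # Weighted sum of normalized rule scores for every candidate job.
    if len(rules) != len(weights):
        raise ValueError("one weight per rule is required")
    totals = [0.0] * len(jobs)
    for rule, weight in zip(rules, weights):
        for i, score in enumerate(normalize([rule(j) for j in jobs])):
            totals[i] += weight * score
    return totals


def select_next(jobs: Sequence[Job], rules: Sequence[Rule],
                weights: Sequence[float]) -> int:
    # Index of the job with the highest compound priority.
    scores = compound_priority(jobs, rules, weights)
    return max(range(len(jobs)), key=scores.__getitem__)


# Illustrative data only (not from the paper).
jobs = [
    {"processing_time": 5.0, "due_date": 10.0},
    {"processing_time": 2.0, "due_date": 30.0},
    {"processing_time": 4.0, "due_date": 12.0},
]
rules = [shortest_processing_time, earliest_due_date]
```

With these illustrative jobs, each single rule and an equal-weight combination pick three different jobs, which is the kind of flexibility a learned weight vector could exploit.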

CLC number: