Efficient Deep Reinforcement Learning with Probability Mask in Online 3D Bin Packing Problem

Mar 22, 2023, 4:20 PM
20m
Auditorium (BHSS, Academia Sinica)

Oral Presentation Track 10: Artificial Intelligence (AI)

Speaker

Mr Takumi Nakajima (Osaka University)

Description

3D Bin Packing is the problem of finding the best way to pack several cargo items into a container so as to maximize the container's packing density. Some variants add constraints such as the weight, stackability, fragility, and orientation of cargo pieces. Since the 3D Bin Packing problem is known to be NP-hard, an exact solution is hard to obtain in a reasonable time, and various approximate solution methods have therefore been proposed. We focus on methods using deep reinforcement learning (DRL) and aim to overcome their weakness: their inapplicability to large-scale problems.

In this study, we propose a method that incorporates heuristic computation into solving the bin packing problem with deep reinforcement learning. Rather than searching the entire space in the container, the proposed method generates candidate placements in advance by applying ideas such as the Bottom-Left and Best-Fit methods. A MASK is then created whose entries are either certainty values or binary values indicating whether the cargo can be placed at a given position. The MASK narrows the action space by being multiplied by the action probabilities produced by DRL, which improves training efficiency. This method significantly reduces the search space while maintaining solution accuracy, and is shown to be effective for efficient learning and reduced computational cost.
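The masking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the height-map representation of the container, the flat-support feasibility rule, and the names `feasibility_mask` and `apply_mask` are all assumptions made for illustration. It builds a binary mask over placement positions and multiplies it by a policy's action probabilities, then renormalizes.

```python
def feasibility_mask(height_map, item_w, item_d, item_h, bin_h):
    """Return a flat binary mask over positions (x, y) of the container floor grid.

    Assumptions (hypothetical, not from the abstract): the container is a
    height map, an item must rest on a flat surface, and it must not exceed
    the container height bin_h.
    """
    rows, cols = len(height_map), len(height_map[0])
    mask = [0.0] * (rows * cols)
    for x in range(rows - item_w + 1):
        for y in range(cols - item_d + 1):
            cells = [height_map[i][j]
                     for i in range(x, x + item_w)
                     for j in range(y, y + item_d)]
            base = max(cells)
            # Feasible only if the footprint is level and the item fits vertically.
            if min(cells) == base and base + item_h <= bin_h:
                mask[x * cols + y] = 1.0
    return mask


def apply_mask(probs, mask):
    """Multiply action probabilities by the mask and renormalize."""
    masked = [p * m for p, m in zip(probs, mask)]
    total = sum(masked)
    if total == 0.0:
        return masked  # no feasible action; the caller must handle this case
    return [v / total for v in masked]


# Usage: a 3x3 floor with one raised cell, and a 2x2x1 item in a bin of height 2.
height_map = [[0, 0, 0],
              [0, 0, 0],
              [0, 0, 1]]
mask = feasibility_mask(height_map, item_w=2, item_d=2, item_h=1, bin_h=2)
uniform = [1.0 / 9] * 9  # stand-in for the DRL policy output
masked_probs = apply_mask(uniform, mask)
```

A certainty-valued mask, as mentioned in the abstract, could replace the 0/1 entries with heuristic scores in [0, 1]; the multiplication and renormalization in `apply_mask` would stay the same.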

Through these efforts, we demonstrated the usefulness of combining action probability distributions with a MASK built from heuristically generated candidate solutions, and showed the possibility of applying deep reinforcement learning to more complex problems. The proposed method improves learning efficiency and achieves performance comparable to that of conventional methods.

In the future, we plan to conduct experiments on the problem of packing cargo of various shapes and materials.

Primary authors

Mr Takumi Nakajima (Osaka University)
Prof. Chonho Lee (Okayama University of Science)
Prof. Tomohiro Mashita (Osaka University)
