
Enhancing the function of the aids to navigation by practical usage of the deep learning algorithm

Published online by Cambridge University Press:  13 January 2025

Yoontae Sim*
Affiliation:
Division of Navigation Safety System, Mokpo Maritime University, Mokpo, Korea
Chong-Ju Chae
Affiliation:
Maritime Safety & Environmental Administration (MSEA), World Maritime University, Malmo, Sweden
*Corresponding author: Yoontae Sim; Email: 12188@naver.com

Abstract

Information is provided to navigators through advanced onboard navigation equipment, such as the electronic chart display and information system (ECDIS), radar and the automatic identification system (AIS). However, maritime accidents still occur, especially in coastal and inland waters where many navigational dangers exist. Recent artificial intelligence (AI) technology is being actively applied in navigation fields, such as collision avoidance and ship detection; however, the aids to navigation (AtoN) system requires more engagement and further exploration. The AtoN system provides critical navigation information by marking navigational hazards, such as shallow-water areas and wrecks, and by visually marking narrow passageways. The prime function of AtoN can be enhanced by applying AI technology, particularly deep learning. With the help of this technology, an algorithm can be constructed to detect AtoN in coastal and inland waters and use the detected AtoN to create a safety function that supplements watchkeepers alongside recent navigation equipment.

Type
Research Article
Copyright
Copyright © The Author(s), 2025. Published by Cambridge University Press on behalf of The Royal Institute of Navigation

1. Background

Navigation is a critical aspect of the maritime industry, with the safe and efficient movement of ships essential for global trade. Deep learning (DL) technology is increasingly applied to improve navigation and safety, reduce costs, and optimise vessel performance.

DL technology can also be applied to improve the accuracy and reliability of electronic navigation systems. With the increasing use of electronic chart display and information systems (ECDIS), it is essential to ensure that these systems are accurate and reliable (Guo et al., 2020). Deep learning algorithms can analyse data from various sources, including AIS, radar and satellite imagery, to provide more accurate and reliable navigation information (Mason et al., 2017; Mohanty et al., 2020; Murray and Perera, 2021).

Another important application of DL technology in maritime navigation is analysing weather and oceanographic data (Rodrigues et al., 2018). DL algorithms can predict weather patterns, sea states and other environmental factors by analysing data from various sources, including satellites, buoys and weather models (Salman et al., 2015; Zhang et al., 2016; Ren et al., 2021).

Over the past decades, consistent efforts to systematically support ship operators have continued to develop and enhance onboard navigational equipment, reducing the rate of maritime accidents with disastrous results (Dominguez-Péry et al., 2021). Unfortunately, maritime accidents cannot be entirely avoided, since most accidents occur due to navigator error (Rothblum, 2000). Furthermore, the leading cause of ship groundings has been misuse of AtoN, which underlines the importance of AtoN in coastal navigation (Jurdziński, 2017; Sahin and Yip, 2020; Choe et al., 2021).

2. Applications in navigation

2.1 Collision avoidance

Developing autonomous vessels is an essential application of DL technology in maritime navigation. Specifically, building a collision avoidance system is considered a critical stage for autonomous ships, as such a system is expected to make decisions affecting the safety of both the vessel itself and ships in the vicinity (Bolbot et al., 2022). With the technological development of autonomous vessels, there is an increasing need for accurate and reliable navigation systems. DL algorithms and models can analyse data from sensors and other sources, allowing the vessel to navigate safely and efficiently, and this technology has become an essential component of collision avoidance systems. These systems use sensors, cameras and other devices to detect potential collisions and provide warnings or take action to avoid them. The algorithms and models process the sensor data in real time to identify potential threats and determine the best course of action to avoid them.

2.2 Ship detection

Ship detection is crucial for several applications, such as maritime surveillance, vessel traffic management and environmental monitoring. Deep learning-based ship detection systems can accurately detect and identify ships in large and complex maritime scenes. DL uses artificial neural networks (ANN), whose design is inspired by the structure and function of the human brain, to perform sophisticated computations on large data sets (O'Shea and Nash, 2015).

Convolutional neural networks (CNNs) are the most popular DL architecture for ship detection; they can learn and extract the image features essential for ship detection (Shi et al., 2018). In ship detection, CNNs are trained on large datasets of satellite or aerial images with and without ships. The network then learns to identify the features that distinguish ship from non-ship regions in the image. Zhang et al. performed ship detection within satellite images using image processing techniques and CNN classification, similar to the region-based convolutional neural network (R-CNN) scheme (Zhang et al., 2016).

Moreover, numerous studies have attempted image-based ship detection. Lee et al. used the Viola–Jones algorithm to detect ships (Lee et al., 2016), and Wang et al. reviewed methods for ship detection with electro-optical images at sea (Wang et al., 2022). Escorcia-Gutierrez et al. developed an effective object detection and classification technique using multi-region convolutional neural networks for small ship detection (OMRCNN-SHD) to determine the occurrence of small ships, achieving an accuracy of 98⋅63 per cent (Escorcia-Gutierrez et al., 2022).

Another approach to ship detection uses object detection algorithms, such as Faster R-CNN and You Only Look Once (YOLO), which can accurately detect and locate multiple ships in a single image. Lee et al. trained the algorithm with YOLOv2 on two public data sets, the Pascal Visual Object Classes (VOC) and the Singapore Maritime Dataset (SMD) (Lee et al., 2018). As a result, the SMD produced more favourable detection results than the VOC.

2.3 AtoN detection

DL technology is increasingly being used in AtoN detection to improve the accuracy and efficiency of maritime navigation. AtoN are visual or audible signals that help mariners navigate safely and avoid hazards; they include buoys, lighthouses and beacons, among other aids. To a great extent, recent CNN-based deep learning technologies outperform all previously existing algorithms (Huang et al., 2016).

Beyond the basic CNN, Du et al. utilised the ResNet-multiscale-attention (RMA) model to analyse the subtle, local differences among navigation mark types in order to recognise navigation marks. Their dataset contained 10,260 images of 42 kinds of navigation marks, resized to 240 × 240, and the model showed 96 per cent accuracy (Du et al., 2021). Han et al. used DL technologies to recognise the lights on navigation marks at night, proposing a navigation mark lights network (NMLNet) to capture light features, including colour and flashing characteristics, from a video clip for the classification of navigation mark lights. Eight thousand minutes of video were used, with the buoys labelled by three elements: colour, lighting phase and lighting period. The primary challenges encountered in this experiment were video clip quality, low-resolution images, light disturbance from shore and being temporarily blinded by other ships or structures (Han et al., 2021).

2.4 Gap analysis

The use of advanced navigational aids, such as radar, ECDIS and AIS, has reduced the number of crew members required on a ship. As a result, it can be difficult for a vessel's limited bridge staff to perform their duties and respond to nearby events when sailing through congested waters (Sahin and Yip, 2020). The development of collision avoidance (CA) and ship detection has been the primary focus of recent DL work, since these two systems are regarded as essential for preparing future autonomous ships (Bolbot et al., 2022). However, coastal navigation should also be regarded as indispensable: grounding happens mainly in coastal areas, and one of its top causes is related to the misuse of AtoN (Jurdziński, 2017; Sahin and Yip, 2020).

3. Methodology and methods

This sequence illustrates the stages of developing an AtoN detection model.

  1. Choosing and selecting a working environment

  2. Dataset making

  3. Training

  4. AtoN detection testing

  5. Evaluating the trained algorithm

The process begins with choosing and selecting a working environment for the appropriate software, tools and computational resources, such as Python and deep learning frameworks. Once the environment is ready, the next step involves creating a dataset of annotated images containing various types of AtoN. This dataset is crucial for training the model, as it provides the necessary data for the model to learn to detect and classify navigation aids.

The training phase begins after the dataset is created. During this stage, the model is trained using the dataset to identify and classify AtoN by analysing the image features. The model's parameters are optimised to improve its accuracy. After training, the model undergoes AtoN detection testing, using unseen video clips to evaluate its real-time detection capabilities in dynamic, real-world scenarios.

Finally, the trained algorithm is evaluated using key metrics, such as mAP, precision, recall and IoU. This evaluation assesses the model's accuracy, effectiveness and reliability in detecting and classifying AtoN, ensuring its suitability for practical maritime applications.

3.1 Phase 1. Choosing and selecting a working environment

Google Colaboratory (Colab) is a cloud-based Jupyter notebook environment provided by Google that allows users to run Python code in a web browser without requiring any local setup (Nelson and Hoover, 2020). Colab offers free access to graphics processing unit (GPU) and tensor processing unit (TPU) computing resources, making it popular for machine-learning tasks that require significant computational power (Carneiro et al., 2018). Users can upload their Jupyter notebooks to Colab or create new notebooks directly in the Colab interface. They can also access and share publicly available notebooks from other users, as well as pre-installed libraries that are popular in deep learning research, such as PyTorch, TensorFlow, Keras and OpenCV (Thuan, 2021).

YOLOv5 (You Only Look Once version 5) is an object detection model that improves on the previous YOLO models, using a single deep neural network to detect objects in images or videos in real time. YOLOv5 is trained on large-scale datasets and can detect objects such as people, cars and animals (Redmon et al., 2016; Redmon and Farhadi, 2016; Redmon and Farhadi, 2018; Bochkovskiy et al., 2020; Jocher, 2020). The algorithm has excellent learning capabilities that enable it to learn representations of objects and apply them in object detection (Karimi, 2021). YOLO computes all the features of an image at once instead of iterating over and classifying different regions (Du, 2018).

3.2 Phase 2. Dataset making

Roboflow is another open-source tool freely available on the web. For the object-detection programme to identify its intended targets, a process called annotation, or labelling, is required. During this process, Roboflow offers various features, such as auto-orient, auto-resize and auto-contrast options using adaptive equalisation. Once all the images have been annotated, Roboflow can generate the dataset from them. For this research, 355 images were collected in Colab using image-crawling code and used in training (Figure 1). The dataset was divided into training and validation sets with an 80–20 split, ensuring that the model is trained on a large portion of the data and evaluated on a separate, unseen portion.

Figure 1. Roboflow dataset

Evaluating the neural network model requires a dataset containing the same type of object the model was trained on, so dividing the data into training and validation sets is a crucial step. The images used in the training process form the 'training set', and the images used in model evaluation form the 'validation set'; the user sets the proportion of the split. Once the dataset-making procedure is complete, the dataset can be exported to Colab in two ways: downloaded as a zip file to the local drive, or as a Jupyter code snippet to copy and paste directly into Colab.
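As a concrete illustration, the proportional split described above can be sketched in plain Python (a minimal sketch; the file names and random seed are arbitrary, and tools such as Roboflow perform this step automatically):

```python
import random

def split_dataset(image_names, train_ratio=0.8, seed=42):
    """Shuffle the annotated images and divide them into
    training and validation sets at the given ratio."""
    names = list(image_names)
    random.Random(seed).shuffle(names)   # reproducible shuffle
    cut = int(len(names) * train_ratio)  # 80% boundary
    return names[:cut], names[cut:]      # (training set, validation set)

# 355 annotated images, as in this study (names are hypothetical)
images = [f"buoy_{i:03d}.jpg" for i in range(355)]
train_set, val_set = split_dataset(images)
print(len(train_set), len(val_set))  # 284 71
```

The split is random but seeded, so the same training and validation sets are reproduced on every run.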

In addition to the personal dataset, an open-source dataset was obtained from the 'AI-hub' website. This dataset was made by the Korea Research Institute of Ships & Ocean Engineering (KRISO) and is composed of the following items:

  • 1,500 h of original video data (mkv format)

  • 5⋅4 million snapshot images extracted from the original data and processed with bounding boxes and polygons (jpg format)

  • Environmental and object images

3.3 Phase 3. Training

To train the AI, YOLOv5 was installed in Colab and the Roboflow dataset was imported. The model was then trained with the script named 'train.py'. Once the training was complete, a folder with the name specified at this phase was created in the category folder. This folder contained the weight file, which was used when applying the algorithm to the video clips. The folder was named 'buoy_detection_results'.
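A typical YOLOv5 training command of this kind, run as a Colab cell, is sketched below. Only the results-folder name is taken from the study; the image size, batch size, epoch count and dataset path are illustrative assumptions, not the exact values used:

```shell
# Run inside Colab after cloning the ultralytics/yolov5 repository
# and installing its requirements. The run directory name matches the
# study's 'buoy_detection_results' folder; the other parameter values
# here are assumptions for illustration.
!python train.py --img 640 --batch 16 --epochs 100 \
    --data buoy_dataset/data.yaml --weights yolov5s.pt \
    --name buoy_detection_results
```

After training, the resulting weights appear under `runs/train/buoy_detection_results/weights/`.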

3.4 Phase 4. AtoN detection testing

After the training, the trained algorithm was applied to sample video clips to verify whether it could adequately detect the marks. First, several video clips were obtained to test the trained algorithm, using specific keywords to filter for the required ones. The video search included terms such as 'channels', 'rivers' and 'coastal waters' that specifically indicate the regions of interest, since the videos had to show sailing in coastal and inland waters where AtoN are primarily found. Because the buoys were too small to be visible in low-resolution videos, candidate videos had to have a high resolution, such as 4K or Full HD class. Video clip length was another filter criterion: most of the lengthy videos did not mainly depict the waterways where buoys could be seen, but rather nature scenes while transiting through channels, featuring tourist destinations and other locations unrelated to this study. Therefore, specific keywords, high video resolution and video length were used to filter the videos. With this guide, four videos were chosen to test the algorithm: 'a5. Masan port navigation. Narrow channel. Masan passage. Fairway. Ship', 'Time Lapse Port of Rotterdam Dolwin Alpha', 'Back in Port Qasim, Pakistan on Maersk Container Vessel after almost 2 years' and 'Giant Merchant vessel passing Columbia River channel in USA'.

During the detection phase, buoys detected with a confidence below 30 per cent were suppressed by setting the confidence threshold ('--conf-thres 0.3'). This threshold was applied to ensure that other objects resembling a buoy would not be recognised as one.
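The effect of a 30 per cent confidence cut-off can be sketched in plain Python (the detection records below are hypothetical, purely for illustration):

```python
def filter_detections(detections, conf_threshold=0.3):
    """Discard detections whose confidence falls below the threshold,
    so buoy-like objects with weak scores are not marked as buoys."""
    return [d for d in detections if d["conf"] >= conf_threshold]

# Hypothetical raw detections from one video frame
frame_detections = [
    {"label": "red_buoy",   "conf": 0.91},
    {"label": "green_buoy", "conf": 0.64},
    {"label": "red_buoy",   "conf": 0.22},  # weak score: likely a false positive
]
kept = filter_detections(frame_detections)
print([d["label"] for d in kept])  # ['red_buoy', 'green_buoy']
```

The low-confidence candidate is dropped before any labelling or guideline drawing takes place.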

Additional code was written in Colab, alongside the detection training, to mark the first red buoy as R1 and the second as R2. A line representing the detected channel limit then connected the two points. The same procedure was applied to the green buoys, labelled G1 and G2. When the ship reached the G1 (R1) point, the subsequent G2 was relabelled G1. The guideline was formed when the first two green (red) buoys were captured in one frame.
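The relabelling and guideline logic described above can be sketched as follows (an illustrative reconstruction, not the study's actual code; buoys are reduced to centre points, and in the full pipeline a call such as OpenCV's `cv2.line` would draw the connecting segment on the frame):

```python
def guideline(buoys):
    """Given centres of same-colour buoys in one frame, nearest first,
    return the endpoints (B1, B2) of the guideline, or None if fewer
    than two buoys are visible."""
    if len(buoys) < 2:
        return None
    return buoys[0], buoys[1]  # e.g. cv2.line(frame, buoys[0], buoys[1], ...)

def buoy_passed(buoys):
    """Ship has reached the nearest buoy: drop it, so the
    subsequent buoy (old B2) becomes the new B1."""
    return buoys[1:]

red_buoys = [(120, 400), (180, 320), (230, 260)]  # hypothetical centre points
assert guideline(red_buoys) == ((120, 400), (180, 320))  # R1-R2 line drawn
red_buoys = buoy_passed(red_buoys)                       # R2 becomes R1
assert guideline(red_buoys) == ((180, 320), (230, 260))  # new R1-R2 line
```

With only one buoy left in view, `guideline` returns `None`, matching the observed behaviour that no line is drawn until two same-colour buoys share a frame.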

3.5 Phase 5. Evaluating the trained algorithm

Evaluating a deep learning model's performance is vital to ensure reliability and effectiveness in real-world applications. Verifying the accuracy and quality of object detection is crucial in ensuring that the detection system is functioning correctly and providing reliable results. In AtoN detection, metrics such as accuracy, precision, recall and overall performance are used to measure the model's performance. This section discusses the details of these metrics and how they apply to the YOLOv5 algorithm used for AtoN detection.

Key evaluation metrics in machine learning, such as mean average precision (mAP), precision, recall and intersection over union (IoU), are essential for assessing model performance. These metrics provide insight into how well a model predicts outcomes, balances errors and generalises to unseen data. The mAP metric concisely describes object detection accuracy and quality (Hammedi et al., 2019). The mAP is calculated by first computing the average precision (AP) for each object class in the dataset. The average precision is the area under the precision–recall curve, which plots precision against recall for different detection threshold values. Higher mAP values indicate that the detection system can accurately detect objects across all classes, while lower mAP values suggest that the system may miss some objects or produce many false positives.
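As a worked illustration of this area-under-the-curve computation, AP can be approximated from a handful of (recall, precision) points; the values below are invented purely for illustration, and mAP is then the mean of the per-class AP values:

```python
def average_precision(recalls, precisions):
    """Approximate AP as the area under the precision-recall curve,
    summing precision * recall-step for each successive point."""
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += p * (r - prev_recall)  # rectangle under this curve segment
        prev_recall = r
    return ap

# Hypothetical PR points for one AtoN class, with recall increasing
recalls    = [0.2, 0.4, 0.6, 0.8]
precisions = [1.0, 0.9, 0.8, 0.6]
print(round(average_precision(recalls, precisions), 2))  # 0.66
```

Averaging such per-class AP values over all AtoN classes yields the mAP reported for the model.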

Precision, defined as the ratio of true positive detections to the total number of positive detections (true positives plus false positives), indicates the accuracy of the model's detections. High precision in AtoN detection minimises false alarms by ensuring that most of the detected buoys or navigation aids are correct. On the other hand, recall measures the ratio of true positive detections to the total number of actual positives (true positives plus false negatives), emphasising the model's ability to identify all relevant buoys or navigation aids. A high recall rate is crucial for ensuring navigational safety by reducing the risk of missing any actual objects. Figure 2 shows that the model's average mAP, precision and recall are about 0⋅8, where 1 is the maximum. The model created was evaluated using these metrics throughout its development.
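These definitions translate directly into code; the following minimal sketch uses made-up boxes and counts purely for illustration:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision(tp, fp):
    return tp / (tp + fp)  # correct detections / all positive detections

def recall(tp, fn):
    return tp / (tp + fn)  # correct detections / all actual buoys

# A detection box overlapping the ground-truth buoy box by half in one axis
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # ~0.333
print(precision(tp=8, fp=2))                # 0.8
print(recall(tp=8, fn=2))                   # 0.8
```

With eight correct detections, two false alarms and two missed buoys, both precision and recall come out at 0⋅8, matching the order of the values reported in Figure 2.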

Figure 2. Metrics calculation

4. Results

The test results were analysed by reviewing the detection accuracy and the formation of guidelines in each video. The mAP was calculated to quantify the overall performance. Additionally, qualitative observations were made to identify cases where the algorithm performed well and where it struggled (e.g. due to low resolution or obstructions).

Rotterdam pilot Marijin van Hoorn took the first video, 'Time Lapse Port of Rotterdam Dolwin Alpha', on 26 May 2013, while shifting 'Heerema Barge H542' from Dordrecht to Schiedam. The resolution of the video was 1,920 × 1,080. Most of the buoys were correctly identified, and when two buoys of the same colour (G1 and G2, or R1 and R2) appeared in a frame, the guideline successfully connected them with a line (Figure 3).

Figure 3. ‘Time Lapse Port of Rotterdam Dolwin Alpha’

The second video, 'a5. Masan port navigation. Narrow channel. Masan passage. Fairway. Ship', was taken by a camera on the top deck of what appears to be a Ro-Ro ship or car carrier, judging from the height at which the video was taken. The video had a resolution of 1,280 × 720. After applying the algorithm to the video, most of the buoys were appropriately detected, and the guideline formed successfully on the red buoys (Figure 4). The guideline also disappeared when a small boat covered the red buoy (Figure 5).

Figure 4. ‘a5. Masan port navigation. Narrow channel. Masan passage. Fairway. Ship’

Figure 5. Red buoy covered by a small boat

The third video clip, 'Giant Merchant vessel passing Columbia River channel in USA', was taken from the bridge of a bulk carrier passing the Columbia River channel, at a resolution of 1,920 × 1,080. The green buoys could not be identified due to a derrick crane on the deck. The buoys were detected, but no guideline was formed in this video since the distance between the buoys was too great: at least two buoys must be captured in the same frame to form a guideline (Figure 6).

Figure 6. ‘Giant Merchant vessel passing Columbia River channel in USA’

The last video, 'Back in Port Qasim, Pakistan on Maersk Container Vessel after almost 2 years', was captured while sailing in the fairway channel, from the bridge of a partially loaded container ship. Its resolution of 960 × 720 was significantly lower than that of the other three videos. The algorithm detected some of the buoys; however, not all visible buoys were detected or identified as coded (Figure 7). The buoys could only be identified when they were close to the vessel's forecastle, because the video resolution was insufficient to distinguish the colours of the buoys at far range.

Figure 7. ‘Back in Port Qasim, Pakistan on Maersk Container Vessel after almost 2 years’

5. Discussion

A total of four video clips were used in the research for the sole purpose of verifying the algorithm. The first and second videos were taken from the top deck of a ship, and two other videos were taken from the bridge. In all videos, there were no significant vibrations that affected the quality of detection. However, the second video was taken with a handheld camera. A few factors, such as the camera's location, detection range and guideline formation, have been highly influential in this examination.

The camera's location is critical for the early detection of the buoys. The detection rate in the first two videos improved because the camera was installed on the foreside; a camera installed on the accommodation or bridge had a slower detection time. However, if the camera is located on the forecastle, there might be heavier vibration when the vessel is underway. Therefore, the camera should be installed at the front of the ship, preferably on the forecastle handrail or foremast, with a measure to mitigate the ship's vibration. The distance between the forecastle and the accommodation area should also be considered, since that long distance affects the buoy detection rate and video quality.

The guideline connecting two buoys was coded to be drawn when two same-colour buoys were detected in one frame. However, this condition was sometimes not fulfilled because the distance between buoys was too great, or the second buoy was too small to be identified as a buoy. In addition, the confidence rate of the farther buoys was unstable, with irregular movement; nevertheless, the rate increased as the ship approached the buoy, stabilising within an acceptable range. Moreover, the guideline disconnected or disappeared when the detected buoys were covered by other ships or objects.

Another critical aspect of guideline formation was the buoy detection range. The guideline cannot be formed when the next buoy in sequence is too far away or too small to be detected. When the next buoy is not visible in the standard view, a specific programme is required to zoom in, automatically or manually, to search for it. A camera with a higher resolution is necessary to improve the initial detection range and obtain a higher accuracy rate, since the guideline forms only when two buoys are detected.

Only recorded video clips from the online source YouTube were used in the research, making it impossible to control the camera position. Real-time, on-site testing should be conducted with a high-resolution camera installed on the forecastle to verify the algorithm's performance and to compare detection performance when the camera is installed at other locations, such as the wheelhouse or top deck.

6. Conclusions

Deep learning technology has many applications in maritime navigation, from vessel tracking to autonomous vessel operations and weather forecasting. As the maritime industry embraces digitalisation, deep learning will become an increasingly important tool for improving navigation safety and for preventing the misuse and misinterpretation of AtoN that lead to groundings in coastal waters.

To accomplish this, this study reviewed previous research to identify gaps in the DL methods used in maritime navigation, particularly AtoN detection. It found that short-handed navigators sailing busy coastal waters could benefit from integrating DL technology for detecting and analysing AtoN.

This study used Google Colab and YOLOv5 to code an algorithm that identifies and analyses AtoN and creates a guideline connecting the detected buoys. Four videos were used to check the algorithm, showing that it worked under suitable conditions but requires further improvement. The evaluation metrics provide a comprehensive understanding of the YOLOv5 model's performance in detecting AtoN. By leveraging these metrics, the study ensures that the proposed DL algorithm can reliably enhance navigational safety by accurately detecting and classifying navigation aids. Further improvements and real-world testing can help refine these metrics and improve the model's reliability in diverse maritime conditions.

During the training phase, the model was trained on a dataset of annotated images containing different types of AtoN, learning to identify and classify navigation aids based on image features. In the validation phase, the model's performance was assessed on a separate set of images, where metrics such as mAP, precision and recall were calculated to determine its accuracy and effectiveness.

Finally, the model was tested on four video clips from online sources to evaluate its real-time detection capabilities. The same metrics were used to measure its ability to detect and accurately mark buoys in the video frames. The model verification results indicated varying levels of detection accuracy, with mAP values around 0⋅8 reflecting reasonably good performance. The calculated metrics suggest a balanced trade-off between precision and recall across the different AtoN classes.

The camera's location was one of the constraints on the early detection of buoys, and due to the fluctuating confidence rate and false detection of the marked buoys, the algorithm still has reliability issues to resolve. Additionally, a camera with a higher resolution is strongly recommended because it allows the algorithm to locate the buoys from a greater distance. As there is a strong correlation between groundings and the misuse of AtoN or misinterpretation of navigation buoys, an algorithm that analyses AtoN and creates a guideline can help prevent navigators from misusing the navigational aids and thereby help prevent groundings.

Competing interests

The authors report there are no competing interests to declare.

References

a5. Masan port navigation. Narrow channel. Masan passage. Fairway. Ship. Video clip from YouTube. https://www.youtube.com/watch?v=Vyc5X7noePQ
Back in Port Qasim, Pakistan on Maersk Container Vessel after almost 2 years. Video clip from YouTube. https://www.youtube.com/watch?v=VLpy9L1NyJY&t=3s
Bochkovskiy, A., Wang, C. Y. and Liao, H. Y. M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint. Available at https://arxiv.org/pdf/2004.10934v1
Bolbot, V., Gkerekos, C., Theotokatos, G. and Boulougouris, E. (2022). Automatic traffic scenarios generation for autonomous ships collision avoidance system testing. Ocean Engineering, 254, 111309. doi:10.1016/j.oceaneng.2022.111309
Carneiro, T., Da Nóbrega, R. V. M., Nepomuceno, T., Bian, G. B., De Albuquerque, V. H. C. and Reboucas Filho, P. P. (2018). Performance analysis of Google Colaboratory as a tool for accelerating deep learning applications. IEEE Access, 6, 61677–61685. doi:10.1109/access.2018.2874767
Choe, C., Noh, Y., Shin, D., Kim, H. and Park, H. (2021). Identifying risk factors of marine accidents in coastal area by marine accident types. Daehan Gyotong Haghoeji, 39(4), 540–554. doi:10.7470/jkst.2021.39.4.540
Dominguez-Péry, C., Vuddaraju, L. N. R., Corbett-Etchevers, I. and Tassabehji, R. (2021). Reducing maritime accidents in ships by tackling human error: A bibliometric review and research agenda. Journal of Shipping and Trade, 6(1), 1–32. doi:10.1186/s41072-021-00098-y
Du, J. (2018). Understanding of object detection based on CNN family and YOLO. Journal of Physics: Conference Series, 1004, 012029. doi:10.1088/1742-6596/1004/1/012029
Du, Y., Sun, S., Qiu, S., Li, S., Pan, M. and Chen, C. (2021). Intelligent recognition system based on contour accentuation for navigation marks. Wireless Communications and Mobile Computing, 2021, 1–11. doi:10.1155/2021/6631074
Escorcia-Gutierrez, J., Beleño, K., Jimenez-Cabas, J., Elhoseny, M., Dahman Alshehri, M. and Selim, M. M. (2022). Intelligent deep learning-enabled autonomous small ship detection and classification model. Computers & Electrical Engineering, 100, 107871. doi:10.1016/j.compeleceng.2022.107871
Giant Merchant vessel passing Columbia River channel in USA. Video clip from YouTube. https://www.youtube.com/watch?v=I4cKzeeKG5E
Guo, M., Guo, C., Zhang, C., Zhang, D. and Gao, Z. (2020). Fusion of ship perceptual information for electronic navigational chart and radar images based on deep learning. The Journal of Navigation, 73(1), 192–211.
Hammedi, W., Ramirez-Martinez, M., Brunet, P., Senouci, S. M. and Messous, M. A. (2019). Deep Learning-Based Real-Time Object Detection in Inland Navigation. In 2019 IEEE Global Communications Conference (GLOBECOM). IEEE, pp. 1–6.
Han, X., Pan, M., Ge, H., Li, S., Hu, J., Zhao, J. and Li, Y. (2021). Multilabel video classification model of navigation mark's lights based on deep learning. Computational Intelligence and Neuroscience, 2021, 6794202, 13 pages. doi:10.1155/2021/6794202
Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., Fischer, I., Wojna, Z., Song, Y., Guadarrama, S. and Murphy, K. (2016). Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors.
Jocher, G. (2020). YOLOv5. GitHub. Retrieved from https://docs.ultralytics.com/
Jurdziński, M. (2017). Causes of ships groundings in terms of integrated navigation model. Annual of Navigation, 24, 119–135.
Karimi, G. (2021). 'Introduction to YOLO Algorithm for Object Detection'. Retrieved from https://www.section.io/engineering-education/introduction-to-yolo-algorithm-for-object-detection/
Lee, J. M., Lee, K. H., Nam, B. and Wu, Y. (2016). Study on Image-Based Ship Detection for AR Navigation. In 2016 6th International Conference on IT Convergence and Security (ICITCS). IEEE, p. 1. doi:10.1109/ICITCS.2016.7740373
Lee, S., Roh, M., Lee, H., Ha, J. and Woo, I. (2018). Image-based Ship Detection and Classification for Unmanned Surface Vehicle Using Real-Time Object Detection Neural Networks. The 28th International Ocean and Polar Engineering Conference.
Mason, E., Yonel, B. and Yazici, B. (2017). Deep Learning for Radar. In 2017 IEEE Radar Conference (RadarConf), pp. 1703–1708.
Mohanty, S. P., Czakon, J., Kaczmarek, K. A., Pyskir, A., Tarasiewicz, P., Kunwar, S., Rohrbach, J., Luo, D., Prasad, M. and Fleer, S. (2020). Deep learning for understanding satellite imagery: An experimental survey. Frontiers in Artificial Intelligence, 3, 534696.
Murray, B. and Perera, L. P. (2021). An AIS-based deep learning framework for regional ship behavior prediction. Reliability Engineering & System Safety, 215, 107819.CrossRefGoogle Scholar
Nelson, M. J. and Hoover, A. K. (2020). Notes on Using Google Colaboratory in AI Education. Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education, pp. 533534.CrossRefGoogle Scholar
O'Shea, K. and Nash, R. (2015). An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458.Google Scholar
Redmon, J. and Farhadi, A. (2016). YOLO9000: Better, faster, stronger. arXiv preprint arXiv, 1612.08242.Google Scholar
Redmon, J. and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767.Google Scholar
Redmon, J., Divvala, S., Girshick, R. and Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779788.CrossRefGoogle Scholar
Ren, X., Li, X., Ren, K., Song, J., Xu, Z., Deng, K. and Wang, X. (2021). Deep learning-based weather prediction: A survey. Big Data Research, 23, 100178.CrossRefGoogle Scholar
Rodrigues, E. R., Oliveira, I., Cunha, R. and Netto, M. (2018). DeepDownscale: A Deep Learning Strategy for High-Resolution Weather Forecast. 2018 IEEE 14th International Conference on e-Science (e-Science), pp. 415422.CrossRefGoogle Scholar
Rothblum, A. M. (2000). Human Error and Marine Safety. National Safety Council Congress and Expo, Orlando, FL.Google Scholar
Sahin, B. and Yip, T. (2020). Analysis of Root Causes for Maritime Accidents Originated From Human Factor. IAME 2020 Conference, 10-13 June, PolyU, Hong Kong.Google Scholar
Salman, A. G., Kanigoro, B. and Heryadi, Y. (2015). Weather Forecasting Using Deep Learning Techniques. 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS), pp. 281285.CrossRefGoogle Scholar
Shi, Q., Li, W., Zhang, F., Hu, W., Sun, X. and Gao, L. (2018). Deep CNN with multi-scale rotation invariance features for ship classification. IEEE Access, 6, 3865638668.CrossRefGoogle Scholar
Thuan, D. (2021). Evolution of Yolo Algorithm and Yolov5: The State-of-the-Art Object Detention Algorithm. Retrieved from https://www.theseus.fi/handle/10024/452552Google Scholar
Time Lapse Port of Rotterdam Dolwin Alpha. Video clip from Youtube. https://www.youtube.com/watch?v=SadEpzNfxtk&t=168sGoogle Scholar
Wang, X., Liu, J. and Liu, X. (2022). Ship feature recognition methods for deep learning in complex marine environments. Complex & Intelligent. Systems, 8, 38813897.CrossRefGoogle Scholar
Zhang, R., Yao, J., Zhang, K., Feng, C. and Zhang, J. (2016). S-CNN-based Ship Detection From High-Resolution Remote Sensing Images. International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences, Vol. 41.Google Scholar
Figure 1. Roboflow dataset

Figure 2. Metrics calculation

Figure 3. ‘Time Lapse Port of Rotterdam Dolwin Alpha’

Figure 4. ‘a5. Masan port navigation. Narrow channel. Masan passage. Fairway. Ship’

Figure 5. Red buoy covered by a small boat

Figure 6. ‘Giant Merchant vessel passing Columbia River channel in USA’

Figure 7. ‘Back in Port Qasim, Pakistan on Maersk Container Vessel after almost 2 years’