A dynamic adaptive grating algorithm for AIS-based ship trajectory compression

Yuanyuan Ji; Le Qi; Robert Balling

doi:10.1017/S0373463321000692

A dynamic adaptive grating algorithm for AIS-based ship trajectory compression

Published online by Cambridge University Press: 18 October 2021

and

Yuanyuan Ji: Affiliation:
School of Geographical Sciences and Urban Planning, Arizona State University, Tempe, AZ 85281, USA.
Le Qi*: Affiliation:
School of Navigation, Wuhan University of Technology, Wuhan 430063, China. Hubei Key Laboratory of Inland Shipping Technology, Wuhan 430063, China
Robert Balling: Affiliation:
School of Geographical Sciences and Urban Planning, Arizona State University, Tempe, AZ 85281, USA.
*: *Corresponding author. E-mail: leqiem@hotmail.com

Article contents

Abstract
Introduction
Methodology
Test and compression results
Conclusion
References

Rights & Permissions

Abstract

Automatic identification system (AIS)-based ship trajectory data are important for analysing maritime activities. As the data accumulate over time, trajectory compression is needed to alleviate the pressure of data storage, migration and usage. The grating algorithm, as a vector data compression algorithm with high compression performance and low computation complexity, has been considered as a very promising approach for ship trajectory compression. This algorithm needs the threshold to be set for each trajectory which limits the applicability over a large number of different trajectories. To solve this problem, a dynamic adaptive threshold grating compression algorithm is developed. In this algorithm, the threshold for each trajectory is dynamically generated using an effective approaching strategy. The developed algorithm is tested with a complex trajectory dataset from the Qiongzhou Strait, China. In comparison with the traditional grating method, our algorithm has improved advantages in the ease of use, the applicability to different trajectories and compression performance, all of which can better support relevant applications, such as ship trajectory data storage and rapid cartographic display.

Keywords

ship trajectory dynamic adaptive threshold grating algorithm vector data compression

Type: Research Article
Information: The Journal of Navigation , Volume 75 , Issue 1 , January 2022 , pp. 213 - 229

DOI: https://doi.org/10.1017/S0373463321000692 [Opens in a new window]
Copyright: Copyright © The Author(s), 2021. Published by Cambridge University Press on behalf of The Royal Institute of Navigation

1. Introduction

The maritime transportation industry continues to see an increased focus on traffic safety, green shipping, marine environmental protection and overall sustainable development (Chen et al., Reference Chen, Zheng, Garg, Xu, Li and Fei2019a, Reference Chen, Zhang, Wan, Li, Huang and Fei2019b; McWhinnie et al., Reference McWhinnie, O'Hara, Hilliard, Le Baron, Smallshaw, Pelot and Canessa2021). Ship trajectory data, often collected by an automatic identification system (AIS), can reflect the locations and behaviour of ships. Based on ship trajectory data, the information can be processed to help with maritime activities (Fournier et al., Reference Fournier, Hilliard, Rezaee and Pelot2018) including location modelling for maritime search and rescue vessels (Pelot et al., Reference Pelot, Akbari and Li2015; Akbari et al., Reference Akbari, Pelot and Eiselt2018), estimation of the ship domain under different conditions to avoid collision (Hörteborn et al., Reference Hörteborn, Ringsberg, Svanberg and Holm2019; Qi et al., Reference Qi, Zheng and Li2011), traffic flow analysis (Qi et al., Reference Qi, Zheng and Gang2017, Reference Qi, Ji, Balling and Xu2021), ship trajectory prediction for traffic organisation optimisation (Qi and Zheng, Reference Qi and Zheng2016a; Alizadeh et al., Reference Alizadeh, Alesheikh and Sharif2021), green development of ports (Hua et al., Reference Hua, Chen, Wan, Xu, Bai, Zheng and Fei2020; Liu et al., Reference Liu, Liu and Qi2021), monitoring of fishing activities (Hu et al., Reference Hu, Jiang, de Souza, Pelot and Matwin2016; Rezaee et al., Reference Rezaee, Brooks and Pelot2017), and others.

However, the frequency of AIS messages is relatively high as an ordinary shipborne AIS device sends a message every two seconds to three minutes (Harati-Mokhtari et al., Reference Harati-Mokhtari, Wall, Brooks and Wang2007; Tichavska et al., Reference Tichavska, Cabrera, Tovar and Araña2015; Alizadeh et al., Reference Alizadeh, Alesheikh and Sharif2021). Accordingly, this generates a large amount of data and contains redundant information due to relatively slow ship movements thereby taking up large amounts of disk space and restricting the convenience and flexibility of the use of AIS data (Mazzarella et al., Reference Mazzarella, Arguedas and Vespe2015; Fiorini et al., Reference Fiorini, Capata and Bloisi2016; Qi and Zheng, Reference Qi and Zheng2016b). Therefore, ship trajectory data often need to be compressed and simplified to reduce redundant data and save storage space (Zhang et al., Reference Zhang, Liu, Cai, Wu and Shi2016). Ship trajectory data compression can alleviate the data storage pressure caused by long-term data accumulation which can help data migration, backup or use. Data compression can also simplify the visualisation and mapping of ship trajectory data, thereby reducing the load on computer memory and the graphics card.

Research on trajectory data compression mainly falls into three types. First is general vector data compression research, which considers the distance, angle and/or directional relationship of the points in the trajectory. This type of research is usually generalised and adaptable to the ship trajectories in different waters. Researchers have proposed a variety of approaches including the choosing interval points algorithm, limiting vertical distance algorithm, limiting angle algorithm, offset angle algorithm, the Douglas–Peucker (DP) algorithm, the grating algorithm and others (Gudmundsson et al., Reference Gudmundsson, Katajainen, Merrick, Ong and Wolle2009; Ji and Wang, Reference Ji and Wang2010; Popa et al., Reference Popa, Zeitouni, Oria and Kharrat2015). Among them, the compression errors of the DP and grating algorithms are smaller than the others (Veregin, Reference Veregin2000; Liu et al., Reference Liu, Li, Yang, Wu, Liu and Liu2019). The DP algorithm has reasonably good compression performance and is widely used. This approach requires all the points of the trajectory to be loaded to compute their Euclidean distance. The computational complexity is $\textrm {O}(n \log n)$ or even $\textrm {O}(n^2$), and large memory and storage space are needed to execute the algorithm (Muckell et al., Reference Muckell, Hwang, Patil, Lawson, Ping and Ravi2011, Reference Muckell, Olsen, Hwang, Lawson and Ravi2014; Singh et al., Reference Singh, Aggarwal, Saxena and Prakash2017). The grating algorithm processes three points at one time, so the computational complexity is O($n$) resulting in an algorithm that requires low memory and has more flexibility and application potential (Ji et al., Reference Ji, Xu and Deng2019; Qi and Ji, Reference Qi and Ji2020).

The second type of research on ship trajectory simplification focuses attention on ship motion characteristics in specific waters or circumstances, such as slow in ports. Studies of this type are usually more detailed in the local water area and are used for specific purposes. In order to retain constant information about speed, heading, position and path of the vessel in ports or other traffic-dense areas, a data reduction algorithm was developed (Ifrim et al., Reference Ifrim, Iuga, Pop, Wallace and Poulopoulos2017). To preserve vessel course alteration, a compression algorithm based on feature point recognition was proposed (Popa et al., Reference Popa, Zeitouni, Oria and Kharrat2015). Multiple dimensions (position, direction, speed) were used to optimise the directed-acyclic-graph-based online trajectory simplification algorithm (Zhang et al., Reference Zhang, Shi, Li and Zhang2020). A trajectory simplification method that better encodes complex behaviours and possibly retains the interesting part in full resolution was proposed (Sánchez-Heres, Reference Sánchez-Heres2019).

The third type of research facilitates the application or usage of trajectory compression, such as compression threshold determination, similarity measure and/or trajectory clustering. An AIS-based method for determining compression algorithm thresholds for the minimum ship domain was proposed (Zhang et al., Reference Zhang, Liu, Cai, Wu and Shi2016). An automatic thresholding was developed based on the channel and trajectory characteristics (Liu et al., Reference Liu, Li, Yang, Wu, Liu and Liu2019). A threshold determining method was studied based on the uncertainty and confidence value associated with each candidate stop in the trajectory (Milaghardan et al., Reference Milaghardan, Abbaspour and Claramunt2018). A trajectory similarity measurement algorithm based on feature point detection and synthetic distance was proposed to measure the similarity between trajectories (Qi and Zheng, Reference Qi and Zheng2016c). A similarity method considering the spatio-temporal behaviour of trajectories was used to find trajectories with acceptable similarity (Abbaspour et al., Reference Abbaspour, Shaeri and Chehreghan2017; Moayedi et al., Reference Moayedi, Abbaspour and Chehreghan2019). A similarity measure considering the course change and the meaning at the route level was developed (Zhao and Shi, Reference Zhao and Shi2019a). A clustering method based on compression and density was proposed to achieve improved marine traffic pattern recognition on ship trajectories (Zhao and Shi, Reference Zhao and Shi2019b).

Our research belongs to the aforementioned first and third types. In the first type of research, on the basis of our previous research on the comparison of different compression algorithms (Ji et al., Reference Ji, Xu and Deng2019; Qi and Ji, Reference Qi and Ji2020), we found that the grating compression algorithm has the performance advantages of low algorithm complexity, high compression performance and less requirement for computer memory. We pioneered the use of the grating algorithm for ship trajectory compression. In the third type of research, we improved the grating method to simplify the usage of the algorithm and optimise the compression performance. In this paper, we propose a dynamic adaptive grating algorithm that can automatically generate an optimal threshold for each trajectory in the dataset based on the users’ expectancy or requirement for the overall compression performance.

This paper is structured such that Section 2 states the methodology of the algorithm, Section 3 shows the test and the results of performances of the proposed algorithm, and Section 4 concludes the contribution of this paper and prospects the future work.

2. Methodology

2.1 Traditional grating compression algorithm

The general approach of the grating algorithm (Ji et al., Reference Ji, Xu and Deng2019) is to process three points (first, middle and last points) through a sector pattern. The connection between the first point and the middle point is the central axis of the sector to judge whether the last point is within the sector opening angle (the opening angle is obtained by the input threshold). If the last point is within the sector opening angle, it is considered that the trajectory has little change, there is information redundancy and therefore data compression is performed by deleting the middle point. If the last point is out of the sector opening angle, it is considered that the trajectory has a non-negligible change and therefore the middle point is retained. That middle point is then taken as the new first point to form a new sector to judge the relative position of the next point, and so on, until the execution reaches the last point of the trajectory. The schematic diagram of the grating algorithm compression principle is shown in Figure 1.

Figure 1. Schematic diagram of the grating algorithm principle: (a) the trajectory; (b) compress $P_1$, $P_2$, $P_3$; (c) compress $P_1$, $P_3$, $P_4$; (d) compress $P_1$, $P_4$, $P_5$; (e) compress $P_4$, $P_5$, $P_6$; (f) compression result

Figure 1(a) is a schematic diagram of a trajectory, with trajectory points $P_1,P_2, \ldots ,P_{15}$ distributed in a non-linear fashion. In Figure 1(b), the geospatial relationship of points $P_1$, $P_2$, $P_3$ is judged by the sector, whose vertex is $P_1$, central axis is $P_1P_2$ and opening angle is the threshold. Because $P_3$ is in the sector, $P_2$ is considered compressible. As Figure 1(c) shows, $P_2$ is deleted, and $P_1$ is still kept as the first point, $P_3$ is set as the new middle point, so $P_1$ and $P_3$ form a sector (the blue area), and it intersects with the previous sector (the grey area) to form the new sector (yellow area). This new sector is used to judge the geospatial relationship with point $P_4$. Because $P_4$ is in the sector, the middle point $P_3$ is deleted. In Figure 1(d), $P_3$ is compressed, so $P_4$ is set as the new middle point. With the central axis $P_1$ $P_4$, the sector is formed as the blue area, and the intersect of the blue sector with the previous grey sector is shown as the yellow area, which is taken as the new sector for this process. Because $P_5$ is outside of the sector, $P_4$ needs to be retained. In Figure 1(e), since $P_4$ is kept, it is taken as a new start point, and then $P_5$ is the new middle point. They form a new sector to judge whether $P_6$ is in the sector or not, which is same to the situation shown in Figure 1(b). The final result is shown in Figure 1(f) in which the white hollow points are the points that have been compressed and deleted, and the black solid points are those that have been retained. It can be seen that the grating method retains the turning characteristics of the trajectory very well.

The algorithm has the characteristics of memorability, as well as a sensitivity to the angle changes. When the trajectory is relatively smooth and there are discarded compression points, the new sector will be the intersection of the current sector and the previous sector(s), and therefore, the new sector retains the features of the previous sector(s). The characteristic of the sensitivity to the angle changes is reflected in the dynamical adjustment of the sector pattern. As shown in Figure 1(c), the central axis of the current sector and the previous sector are $P_1P_3$ and $P_1P_2$, respectively, but their directions are slightly different. The algorithm reflects the changes by narrowing down the opening angle dynamically.

In complex navigation waterways, there are many shapes of trajectories, which may be straight, curved or turning. If these trajectories with large differences in shape use the same threshold, the error distribution may be great for some of them because of the improper threshold. And since the threshold is the input, it is unrealistic to set the thresholds one by one to best fit the performance of each trajectory, and it would be time-consuming and labour-intensive to try to determine the best threshold repeatedly. These facts lead to a reduction in the performance or the applicability when the grating algorithm is used to compress a large dataset with different trajectories.

However, due to its excellent compression performance and fast computation demonstrated in practical use, the grating algorithm is considered as a promising algorithm worthy for further promotion and application. In this paper, we commit the effort on adaptive thresholding to dynamically assign the optimal threshold to each trajectory, to improve the algorithm in ease of use, better generalisation to the complex dataset with different trajectories and improvement in the compression performance.

2.2 Dynamic adaptive grating (DAG) compression algorithm

2.2.1 Definition of threshold

To determine the threshold, we must first clarify the specific definition of the threshold for calculation. As the above context states, the opening angle is the threshold. To simplify the threshold to a more understandable unit, we convert the threshold from the opening angle to the length of its opposite side, as shown in Figure 2.

Figure 2. Schematic diagram of the sector pattern

Figure 2 shows that the starting point $P_1$ and the middle point $P_2$ are connected as a central axis, with the starting point as a vertex and angle $\theta$ as an opening angle, so a sector is formed. We take the length of the side opposite the angle $\theta$ as the threshold $T$, as shown in Figure 2. The threshold $T$ can be obtained by Equation (1):

(1)\begin{equation} T=2D_{p_1p_2}\cdot\tan\left( \frac{\theta}{2} \right), \end{equation}

where $D_{p_1p_2}$ is the length between point $P_1$ and $P_2$.

Because using a threshold for data processing is often to achieve a certain goal or performance (such as error or compression ratio) (Goh et al., Reference Goh, Basah, Yazid, Safar and Saad2018), we associate the compression performance with the selection of the threshold. In order to automatically obtain the threshold that adapts to compression requirements, an approximation method was developed to achieve this goal; the approximation method can use the compression performance as feedback to adaptively approximate the desired threshold.

2.2.2 Calculation of the candidate threshold

In order to derive the desired threshold, the compression performance can be taken as a guide. For instance, we take the compression error as the requirement of compression and require that the practical compression error not be greater than a setting value $E$. It was found that the compression error is nearly positively correlated with the threshold (refer to Figure 8). We use the linear relationship as the function to simplify the algorithm complexity, as shown in Equation (2), then the candidate threshold $T_i$ is obtained by Equation (3):

(2)\begin{align} & \frac{T_i-T_{\text{min}}}{E-E_{\text{min}}}=\frac{T_{\text{max}}-T_{\text{min}}}{E_{\text{max}}-E_{\text{min}}}, \end{align}

(3)\begin{align} & T_i=\frac{T_{\text{max}}-T_{\text{min}}}{E_{\text{max}}-E_{\text{min}}}\left(E-E_{\text{min}}\right)+T_{\text{min}}, \end{align}

where $T_i$ is the candidate threshold, $i$ ($i = 0, 1, 2,\ldots$) is the label representing the sequence of the candidate threshold; $T_{\textrm {min}}$ is the lower boundary of the threshold, usually set to 0; $E_{\textrm {min}}$ is the lower boundary of the error equal to 0 when $T_{\textrm {min}}$ is 0; $T_{\textrm {max}}$ is the upper boundary of the threshold, because it corresponds to the opening angle of the sector, it can be obtained by setting the angle no greater than $180^\circ$ in Equation (1); and $E_{\textrm {max}}$ is the upper boundary of the error, which is calculated by inputting $T_{\textrm {max}}$ to the compression algorithm.

2.2.3 Evaluation of the compression performance

The candidate threshold $T_i$, which is calculated from Equation (3), is then added into the compression algorithm to obtain the compressed trajectory, as shown in Equation (4):

(4)\begin{equation} (x_c, y_c) =f_{AG}(x_0, y_0, T_i ), \end{equation}

where $(x_0, y_0)$ is the coordinate pair of the original trajectory, $(x_c, y_c)$ is the coordinate pair of the compressed trajectory and $f_{AG}$ is the grating compression algorithm.

Then the performance corresponding to the threshold is evaluated. When we take the compression error as an example, the area error is used as the performance index. Different from the length error, the area error refers to the area of all closed areas formed by the intersection of the compressed and uncompressed trajectories, which can effectively avoid the possibility of reflecting the difference when the starting point and the ending point of the trajectories are very close to each other. The equation of area error is shown as Equation (5):

(5)\begin{equation} \varepsilon_i=f_{p}(x_0, y_0, x_c, y_c), \end{equation}

where $\varepsilon _i$ is the actual error corresponding to the candidate threshold $T_i$ and $f_P$ is the error function.

Another commonly used performance index is the data compression ratio and the equation is as follows:

(6)\begin{equation} r=\frac{N(x_0, y_0)}{N(x_c, y_c)}, \end{equation}

where $r$ is the data compression ratio, $N(x_0, y_0)$ is the number of points in the uncompressed trajectories and $N(x_c, y_c)$ is the number of points in the compressed trajectories.

2.2.4 Performance comparison and approximation

If $\varepsilon _i< E$, that is, the compression performance index meets the requirements, then the corresponding candidate threshold is the adaptive threshold. However, if $\varepsilon _i>E$, $T_i$ and $\varepsilon _i$ will be taken as the new $T_{\textrm {max}}$ and $E_{\textrm {max}}$, forming a new linear relationship with $T_{\textrm {min}}$ and $E_{\textrm {min}}$, to continue to approach the desired threshold until the candidate meets the requirements. The diagram of threshold approximation is shown in Figure 3(a).

Figure 3. Schematic diagram of threshold approximation

However, there is an issue. If the relationship between the threshold and the compression error is the concave function shown as Figure 3(a), it would be difficult to approximate the error which is less than $E$. This issue may lead to numerous or even endless iterations of approximation. In order to solve this problem, we propose to use $E'$ instead of $E$ as input, let:

(7)\begin{equation} E'=k\cdot E, \end{equation}

where $\forall k\in [ 0,1)$. The larger the value of $k$, the slower the approximation speed, but the smaller the deviation between the outcome error and expected error $E$; and vice versa. Here, $E$ is still used as the basis for judging, as shown in Figure 3(b). Figure 3(a) shows that, after 3 times of approximation, the errors are $E_0$, $E_1$ and $E_2$, respectively, where $E_0>E_1>E_2$. Yet the value of $E_2$ is still greater than $E$, the approximate termination condition is not met, so the operation continues. In Figure 3(b), after 2 approximations, the errors are $E_0$ and $E_1$, where $E_0>E_1$. The value of $E_1$ is less than $E$, which means that the approximate termination condition is satisfied and the operation can be stopped. Therefore, comparing the schematic diagrams of Figures 3(a) and 3(b), it can be seen that using $E'$ instead of $E$ can improve the efficiency of the approximation iteration and prevent endless iteration.

2.2.5 Proposed DAG compression algorithm

Our proposed DAG compression algorithm takes the user's expectation for compression performance as a guide. This guide is then used to adaptively derive an appropriate threshold, and then uses the threshold for ship trajectory compression. The pseudocode of the algorithm is as follows:

DAG compression algorithm

According to the steps of the algorithm, the computational complexity of the algorithm can be estimated; the big-O notation is generally used to estimate the computational complexity (Vaz et al., Reference Vaz, Shah, Sawhney and Deolekar2017). Because complexity of grating algorithm is O$( n)$, the complexity of the DAG compression algorithm is estimated as Equation (8):

(8)\begin{equation} T(n) =m\cdot \text{O} (n) +n+5m+5\approx \text{O} (n), \end{equation}

where $T(n)$ is the complexity of the DAG compression algorithm, $n$ represents the points in the trajectory dataset, $m$ represents the times of iteration for approximation and O represents the order of magnitude.

3. Test and compression results

3.1 Ship trajectories dataset for testing

To test the effectiveness of our proposed DAG compression algorithm and compare the performance with the traditional grating algorithm, a historical AIS messages record of ship trajectories in the Qiongzhou Strait is used as the test dataset. There are 79,934 points in the dataset, as shown in Figure 4.

Figure 4. The uncompressed ships’ trajectories points in the Qiongzhou Strait

These points are tagged with a Maritime Mobile Service Identity (MMSI) number, which is a unique 9-digit number that is assigned to an AIS unit. Because AIS equipment is installed on each ship, the MMSI number is unique to each ship. Then the points are classified by MMSI number and divided into 200 trajectories. The points of each trajectory are connected in time order, the result is shown in Figure 5.

Figure 5. The uncompressed trajectories of ships in the Qiongzhou Strait

In the dataset, different shapes of trajectories are contained to increase the diversity of the sample. One-hundred trajectories with 41,624 points are relatively straight trajectories, running nearly east and west, shown as the light blue lines in Figure 5. Another 100 trajectories with 38,310 points are curve trajectories, turning or tortuous, shown as the blue lines in Figure 5. All of these trajectories are used for the test.

3.2 Compression results and comparison

To obtain the compression results for our proposed DAG algorithm and the traditional grating algorithm, we set the threshold as 0·0138 n miles for the traditional grating compression algorithm, 3,950 compressed points are left out of the 79,934 uncompressed points, so the overall compression ratio of the dataset is approximately 20. The compression result is shown as Figure 6(a), in which the yellow lines represent the relatively straight trajectories and the red lines correspond to the curve trajectories. Accordingly, Figure 6(b) is the compression result of the DAG compression algorithm when the compression ratio is 20, in which the light green lines represent the relatively straight trajectories and the green lines represent the curve trajectories.

Figure 6. The compression results: (a) traditional grating compression; (b) DAG compression

Figure 6 shows that the overall compression results of these two compression algorithms are similar, but a closer look reveals some differences in the compression points. In fact, they are not only different in the distribution of compression points, but also in the threshold(s) as well as in the compression errors. The average compression error of the traditional grating compression algorithm is 0·035 sq. n miles, while the error is 0·029 sq. n miles for the DAG compression algorithm. Additionally, the maximum compression error of the traditional grating compression algorithm is 0·056 sq. n miles, while the error is 0·033 sq. n miles for the DAG compression algorithm. Therefore, when they are of the same compression ratio, the DAG compression algorithm has a lower error than the traditional grating algorithm both in the average error and the maximum compression error.

Because the compression trajectories in Figure 6 are a bit too dense to distinguish the overlaps, 2 curves and 2 straight trajectories are taken out from the dataset to better illustrate the difference of these two algorithms. The compression results are shown in Figure 7.

Figure 7. The compression results: (a) curve trajectories compressed by the traditional compression algorithm; (b) straight trajectories compressed by the traditional compression algorithm; (c) curve trajectories compressed by the DAG algorithm; (d) straight trajectories compressed by DAG algorithm

Figure 7 shows that in the results of the traditional algorithm, the points of the curve trajectories are significantly denser than the points of the straight trajectories. This is mainly because the curve trajectories and the straight trajectories are of the same compression threshold, regardless of the tortuous nature of the trajectories. For the traditional compression algorithm, although the overall compression ratio is 20 and the average compression error 0·035 sq. n miles, the compression ratio of the 2 curve trajectories is 6 with an average error of 0·019 sq. n miles; and the compression ratio of the 2 straight trajectories is 37 with an average error of 0·044 sq. n miles. Hence, the compression statistics of straight trajectories and curves vary considerably.

In contrast, the proposed DAG algorithm considers the compression requirement or expectation (here the compression error is close to 0·03 sq. n miles), and assigns an adaptive threshold to each trajectory. For the curve trajectories, their thresholds are 0·018 and 0·017, respectively; for the straight trajectories, their thresholds are 0·009 and 0·01, respectively. Therefore, the compression ratio of 2 curve trajectories is 19 with an average compression error of 0·027 sq. n miles, and the compression ratio of 2 straight trajectories is 31 with an average compression error of 0·029 sq. n miles. Hence, the difference of straight trajectories and curves is relatively smaller than the traditional compression algorithm.

Figures 6 and 7 are the results of a compression ratio equal to 20. To test the performances in a wider range of compression ratios, more tests were conducted. The results are shown in Figure 8, where the compression ratios are from approximately 1·5 to 50. Every point in the curve of Figure 8 represents a statistic of the compression of an overall 200 trajectories. Because each trajectory has its compression error, to extract the statistical value of error, the average error and maximum error are used respectively for the comparison. Figure 8(a) is the ‘compression ratio–average error’ curve of the proposed DAG compression algorithm and the traditional grating compression algorithm, and Figure 8(b) is ‘compression ratio–maximum error’ curve.

Figure 8. Compression ratio–error curves: (a) average error; (b) maximum error

Figure 8 shows that when the compression ratio is less than 10, the performances of these two algorithms are similar, but when the compression ratio is greater than 10, the proposed DAG algorithm has lower errors than the traditional grating compression algorithm both in the average compression error and the maximum compression error. Additionally, the difference in maximum compression error is much more obvious, which is approximately 0·02 sq. n miles.

The main reason for the improvement in compression error is the adaptive thresholding. Instead of using the same threshold for all the trajectories, our proposed DAG compression algorithm assigns different thresholds to different trajectories. To better illustrate the specific performance of the DAG compression algorithm, 3 compression ratios (10, 20, 40) are selected from Figure 8 to demonstrate the performance of the algorithms. Figure 9 shows the distribution of adaptive thresholds of the trajectories under the 3 compression ratios.

Figure 9. The distribution of adaptive thresholds: (a) overall compression ratio = 10; (b) overall compression ratio = 20; (c) overall compression ratio = 40

Figure 9(a) shows the distribution of adaptive thresholds of the DAG compression algorithm under the compression ratio of 10, where the corresponding fixed threshold for the traditional grating compression algorithm is 0·0072 n miles. From Figure 9(a), it can be seen that the thresholds of relatively straight trajectories are generally less than the thresholds of curve trajectories. For the straight trajectories, the threshold of the peak bar is between 0·005 and 0·0055 n miles, the span of the thresholds is from 0·0035 to 0·0095 n miles; for the curve trajectories, the threshold of the peak bar is between 0·007 and 0·0075 n miles, the span of the thresholds is from 0·004 to 0·0145 n miles; for all trajectories, the threshold of the peak bar is between 0·005 and 0·0055 n miles, and overall span is from 0·0035 to 0·0145 n miles. Therefore, the peak threshold of curve trajectories is greater than the peak threshold of straight trajectories and the thresholds of curve trajectories have a wider span than those of the straight trajectories. Additionally, the fixed threshold of the traditional grating compression algorithm is close to the peak threshold for curve trajectories of the DAG compression algorithm.

Figure 9(b) shows the distribution of adaptive thresholds of the DAG compression algorithm under the compression ratio of 20, where the corresponding fixed threshold for the traditional grating compression algorithm is 0·0138 n miles. For the straight trajectories, the threshold of the peak value is between 0·0096 and 0·0104 n miles, the span of the thresholds is from 0·0064 to 0·0192 n miles; for the curve trajectories, the threshold of the peak value is between 0·012 and 0·0128 n miles, and the span is from 0·0072 to 0·0232 n miles. For all trajectories, the threshold of peak is between 0·014 and 0·0112 n miles and the span is from 0·0064 to ·0232 n miles. Therefore, the peak threshold of curve trajectories is greater and the span of the thresholds of curve trajectories is slightly longer than those of the straight trajectories. The fixed threshold of the traditional grating algorithm is somewhat greater than the peak value of the curve thresholds.

Figure 9(c) shows the distribution of adaptive thresholds of the DAG compression algorithm under the compression ratio of 40, where the corresponding fixed threshold for the traditional grating compression algorithm is 0·0273 n miles. For the straight trajectories, the threshold of the peak value is between 0·021 and 0·0225 n miles and the span is from 0·0015 to 0·0495 n miles; for the curve trajectories, the threshold of the peak value is between 0·0285 and 0·03 n miles, and span is from 0·018 to 0·048 n miles. For all the trajectories, the threshold of peak value is between 0·024 and 0·0255 n miles, and the span is from 0·015 to 0·0495. Therefore, the peak threshold of curve trajectories is greater than the corresponding value of straight trajectories, but the span of thresholds of straight trajectories is longer than that of curve trajectories and equal to the threshold span of all trajectories. The fixed threshold of the traditional grating algorithm is close to the peak value of curve thresholds.

From Figure 9, it can be concluded that the peak value of straight trajectories thresholds is less than the peak value of curve trajectories threshold. The span of straight trajectories thresholds is narrower than the span of curve trajectories when compression is 10, but it becomes longer with the increase of compression ratio. Additionally, the fixed threshold in the traditional grating compression algorithm is close to the peak value of curve trajectories in the DAG compression algorithm when they are of the same overall compression ratio. This means in the traditional grating compression algorithm, the straight trajectories use a greater threshold than the corresponding trajectories of the DAG compression algorithm, which will cause a great error for straight trajectories. However, for some threshold of curve trajectories, the DAG compression algorithm assigns greater thresholds than the traditional grating compression algorithm, which will also cause some greater errors but within the requirement.

The compression errors graphs of every trajectory in the compression are listed in Figure 10.

Figure 10. Compression errors graphs: (a) overall compression ratio = 10; (b) overall compression ratio = 20; (c) overall compression ratio = 40

In each graph of Figure 10, the first 100 labels of trajectories (No. 1~100) correspond to straight trajectories, the second 100 labels (No. 101~200) represent curve trajectories. From all the graphs in Figure 10, it can be seen that the errors of traditional grating compression algorithms are generally greater than the DAG compression algorithms, especially for straight trajectories. This is mainly because the straight trajectories thresholds in the DAG compression algorithms are smaller than those in the traditional algorithms. For curve parts, some errors of the traditional grating compression results are greater than the DAG compression results, and some errors are smaller. This is mainly due to some adaptive thresholds of curve trajectories are less than the fixed threshold while some are greater. For the errors of the traditional grating algorithm, the errors of straight trajectories are greater than the errors of curve trajectories, mainly because both straight trajectories and curve trajectories use the same threshold for compression, and the threshold is somewhat large for the straight trajectories but small for the curve trajectories. However, for the errors of the DAG compression algorithm, errors of straight trajectories are nearly equivalent to the errors of curve trajectories, especially for the maximum values, and they are within the desired error.

To analyse the relationship between compression ratio and compression error of each trajectory in the dataset, the ‘compression ratio–error’ scatter diagrams are shown in Figure 11. Figures 11(a), 11(b), 11(c) correspond to the overall compression ratio equal to 10, 20, 40, respectively.

Figure 11. ‘Compression ratio–error’ scatter diagrams: (a) overall compression ratio = 10; (b) overall compression ratio = 20; (c) overall compression ratio = 40

In Figure 11, each point in the scatter diagram represents a trajectory in the dataset, where each colour of points specifies a specific category of trajectories compressed by an algorithm. It can be seen that the distribution of the green points is more concentrated than the red points, which means that the compression errors of curve trajectories compressed by the DAG algorithm are smaller than the errors of curve trajectories compressed by the traditional grating algorithm. Besides, in Figure 11(a), most of the green points are also more concentrated in abscissa, which means the compression ratios of curve trajectories in the DAG algorithm are close to a specific compression ratio, while the compression ratios of curve trajectories in traditional grating algorithms disperse in a wider range. The concentricity of green points becomes dispersive with the increase of compression ratio, as shown in Figures 11(b) and 11(c), but are still more concentrated than the red points especially in ordinate. Therefore, for the curve trajectories, the compression errors of the DAG algorithm are generally smaller than those of the traditional grating algorithm, and the compression ratio–error relationship is much more clustered in the DAG compression than in the traditional grating compression, which means the compression results of each trajectory in the dataset is more similar in the DAG compression.

The blue points and yellow points in the scatter diagrams of Figure 11 represent the compression results of straight trajectories by the DAG compression algorithm and the traditional compression algorithm, respectively. It can be learned from Figure 11 that the blue points are lower than the yellow points, which means the compression error of the DAG compression algorithm for each trajectory is generally smaller than that of the traditional grating compression algorithm. Most of the blue points in Figure 11(a) are located in a smaller range in abscissa than the yellow points, while this characteristic becomes less obvious as the compression rate increases, as shown in Figures 11(b) and 11(c). This is because in the DAG compression, the straight trajectories use smaller thresholds than the fixed threshold of the traditional grating compression, and then most of the compression ratios of the DAG compression are not as large as those of the traditional grating compression. As the overall compression ratio increases, especially to 40, the dispersions in abscissa of these two algorithms have less difference, but the DAG compression still has smaller error both in average error and maximum error.

4. Conclusion

A DAG compression algorithm was proposed in this paper, with consideration of the expectation or requirement for the compression performance and with an adaptive thresholding module to dynamically obtain the suitable threshold for each trajectory. The DAG algorithm appears to have a better compression than the traditional grating compression algorithm.

This algorithm has a good universality for ship trajectories. It can deal well with the situation where the dataset contains trajectories with different shapes. Through the test on a complex dataset including both relatively straight trajectories and curves of various shapes and directions, our DAG compression dynamically assigns a relatively lower threshold to straight trajectories than the curve trajectories, which makes the compression more balanced and performance more concentrated, rather than the straight trajectory having a greater error while the curve having a much smaller error or the errors of trajectories being randomly dispersed over a wide range as observed with the traditional grating compression algorithm.

The automation feature of the algorithm simplifies the process of operation, saving time and effort for the operator having to repeatedly test to select the threshold. By considering an optimisation strategy in determining the threshold, the complexity of the DAG compression algorithm is maintained as O$( n)$, which does not lead to an obvious increase in the computation complexity. In the future, the application of the algorithm can be further promoted in practice, for example, a practical toolbox can be developed so as to facilitate data visualisation, mapping and data analysis.

Acknowledgment

The authors greatly appreciate the work of data processing and language help done by researchers from Arizona State University and Wuhan University of Technology.

Funding statement

This research was funded by Natural Science Foundation of Hubei Province (grant number 2019CFB339) and China Scholarship Council (CSC No. 202006955031).

References

Abbaspour, R. A., Shaeri, M. and Chehreghan, A. (2017). A method for similarity measurement in spatial trajectories. Spatial Information Research, 25(3), 491–500.CrossRef Google Scholar

Akbari, A., Pelot, R. and Eiselt, H. A. (2018). A modular capacitated multi-objective model for locating maritime search and rescue vessels. Annals of Operations Research, 267(1), 3–28.CrossRef Google Scholar

Alizadeh, D., Alesheikh, A. A. and Sharif, M. (2021). Vessel trajectory prediction using historical automatic identification system data. The Journal of Navigation, 74(1), 156–174.CrossRef Google Scholar

Chen, J., Zheng, T., Garg, A., Xu, L., Li, S. and Fei, Y. (2019a). Alternative maritime power application as a green port strategy: Barriers in China. Journal of Cleaner Production, 213, 825–837.CrossRef Google Scholar

Chen, J., Zhang, W., Wan, Z., Li, S., Huang, T. and Fei, Y. (2019b). Oil spills from global tankers: Status review and future governance. Journal of cleaner production, 227, 20–32.CrossRef Google Scholar

Fiorini, M., Capata, A. and Bloisi, D. D. (2016). AIS data visualization for maritime spatial planning (MSP). International Journal of e-Navigation and Maritime Economy, 5, 45–60.CrossRef Google Scholar

Fournier, M., Hilliard, R. C., Rezaee, S. and Pelot, R. (2018). Past, present, and future of the satellite-based automatic identification system: Areas of applications (2004–2016). WMU Journal of Maritime Affairs, 17(3), 311–345.CrossRef Google Scholar

Goh, T. Y., Basah, S. N., Yazid, H., Safar, M. J. A. and Saad, F. S. A. (2018). Performance analysis of image thresholding: Otsu technique. Measurement, 114, 298–307.CrossRef Google Scholar

Gudmundsson, J., Katajainen, J., Merrick, D., Ong, C. and Wolle, T. (2009). Compressing spatio-temporal trajectories. Computational Geometry, 42(9), 825–841.CrossRef Google Scholar

Harati-Mokhtari, A., Wall, A., Brooks, P. and Wang, J. (2007). Automatic identification system (AIS): Data reliability and human error implications. The Journal of Navigation, 60(3), 373–389.CrossRef Google Scholar

Hörteborn, A., Ringsberg, J., Svanberg, M. and Holm, H. (2019). A revisit of the definition of the ship domain based on AIS analysis. Journal of Navigation, 72(3), 777–794.CrossRef Google Scholar

Hu, B., Jiang, X., de Souza, E. N., Pelot, R. and Matwin, S. (2016). Identifying Fishing Activities from AIS Data with Conditional Random Fields. 2016 Federated Conference on Computer Science and Information Systems. Gdańsk, Poland: IEEE, 47–52.CrossRef Google Scholar

Hua, C., Chen, J., Wan, Z., Xu, L., Bai, Y., Zheng, T. and Fei, Y. (2020). Evaluation and governance of green development practice of port: A sea port case of China. Journal of Cleaner Production, 249, 119434.CrossRef Google Scholar

Ifrim, C., Iuga, I., Pop, F., Wallace, M. and Poulopoulos, V. (2017). Data Reduction Techniques Applied on Automatic Identification System Data. Semantic Keyword-based Search on Structured Data Sources. Gdańsk, Poland: Springer, 14–19.Google Scholar

Ji, H. and Wang, Y. (2010). The Research on the Compression Algorithms for Vector Data. 2010 International Conference on Multimedia Technology. Ningbo, China: IEEE, 1–4.CrossRef Google Scholar

Ji, Y., Xu, W. and Deng, A. (2019). A Study of Vessel Trajectory Compression Based on Vector Data Compression Algorithms. International Conference on Business Information Systems. Seville, Spain: Springer, 473–484.CrossRef Google Scholar

Liu, J., Li, H., Yang, Z., Wu, K., Liu, Y. and Liu, R. W. (2019). Adaptive Douglas-Peucker algorithm with automatic thresholding for AIS-based vessel trajectory compression. EEE Access, 7, 150677–150692.Google Scholar

Liu, J., Liu, Y. and Qi, L. (2021). Modelling liquefied natural gas ship traffic in port based on cellular automaton and multi-agent system. The Journal of Navigation, 74(3), 533–548.CrossRef Google Scholar

Mazzarella, F., Arguedas, V. F. and Vespe, M. (2015). Knowledge-based Vessel Position Prediction using Historical AIS Data. 2015 Sensor Data Fusion: Trends, Solutions, Applications (SDF). Bonn, Germany: IEEE, 1–6.CrossRef Google Scholar

McWhinnie, L. H., O'Hara, P. D., Hilliard, C., Le Baron, N., Smallshaw, L., Pelot, R. and Canessa, R. (2021). Assessing vessel traffic in the Salish Sea using satellite AIS: An important contribution for planning, management and conservation in southern resident killer whale critical habitat. Ocean & Coastal Management, 200, 105479.CrossRef Google Scholar

Milaghardan, A. H., Abbaspour, R. A. and Claramunt, C. (2018). A Dempster-Shafer based approach to the detection of trajectory stop points. Computers, Environment and Urban Systems, 70, 189–196.CrossRef Google Scholar

Moayedi, A., Abbaspour, R. A. and Chehreghan, A. (2019). An evaluation of the efficiency of similarity functions in density-based clustering of spatial trajectories. Annals of GIS, 25(4), 313–327.CrossRef Google Scholar

Muckell, J., Hwang, J. H., Patil, V., Lawson, C. T., Ping, F. and Ravi, S. S. (2011). SQUISH: An Online Approach for GPS Trajectory Compression. The 2nd International Conference on Computing for Geospatial Research & Applications. Washington DC, USA: ACM, 1–8.CrossRef Google Scholar

Muckell, J., Olsen, P. W., Hwang, J. H., Lawson, C. T. and Ravi, S. S. (2014). Compression of trajectory data: A comprehensive evaluation and new approach. GeoInformatica, 18(3), 435–460.CrossRef Google Scholar

Pelot, R., Akbari, A. and Li, L. (2015). Vessel location modeling for maritime search and rescue. International Series in Operations Research & Management Science, 232, 369–402.CrossRef Google Scholar

Popa, I. S., Zeitouni, K., Oria, V. and Kharrat, A. (2015). Spatio-temporal compression of trajectories in road networks. GeoInformatica, 19(1), 117–145.CrossRef Google Scholar

Qi, L. and Ji, Y. (2020). Ship trajectory data compression algorithms for automatic identification system: Comparison and analysis. Journal of Water Resources and Ocean Science, 9(2), 42–47.CrossRef Google Scholar

Qi, L. and Zheng, Z. (2016a). Trajectory prediction of vessels based on data mining and machine learning. Journal of Digital Information Manage, 14(1), 33–40.Google Scholar

Qi, L. and Zheng, Z. (2016b). Vessel trajectory data compression based on course alteration recognition. International Journal of Simulation: Systems, Science and Technology, 17(1), 1–8.Google Scholar

Qi, L. and Zheng, Z. (2016c). A measure of similarity between trajectories of vessels. Journal of Engineering Science and Technology Review, 9(1), 17–22.CrossRef Google Scholar

Qi, L., Zheng, Z. and Li, G. (2011). AIS-data-based ship domain of ships in sight of one another. Journal of Dalian Maritime University, 37(1), 48–50.Google Scholar

Qi, L., Zheng, Z. and Gang, L. (2017). A cellular automaton model for ship traffic flow in waterways. Physica A: Statistical Mechanics and its Applications, 471, 705–717.CrossRef Google Scholar

Qi, L., Ji, Y., Balling, R. and Xu, W. (2021). A cellular automaton-based model of ship traffic flow in busy waterways. The Journal of Navigation, 74(3), 605–618.CrossRef Google Scholar

Rezaee, S., Brooks, M. R. and Pelot, R. (2017). Review of fishing safety policies in Canada with respect to extreme environmental conditions and climate change effects. WMU Journal of Maritime Affairs, 16(1), 1–17.CrossRef Google Scholar

Sánchez-Heres, L. F. (2019). Simplification and event identification for AIS trajectories: The equivalent passage plan method. The Journal of Navigation, 72(2), 307–320.CrossRef Google Scholar

Singh, A. K., Aggarwal, V., Saxena, P. and Prakash, O. (2017). Performance Analysis of Trajectory Compression Algorithms on Marine Surveillance Data. 2017 International Conference on Advances in Computing, Communications and Informatics. Manipal, Mangalore, India: IEEE, 1074–1079.CrossRef Google Scholar

Tichavska, M., Cabrera, F., Tovar, B. and Araña, V. (2015). Use of the Automatic Identification System in Academic Research. International Conference on Computer Aided Systems Theory. Las Palmas de Gran Canaria, Spain: Springer, 33–40.CrossRef Google Scholar

Vaz, R., Shah, V., Sawhney, A. and Deolekar, R. (2017). Automated Big-O Analysis of Algorithms. 2017 International Conference on Nascent Technologies in Engineering. Vashi, India: IEEE, 1–6.CrossRef Google Scholar

Veregin, H. (2000). Quantifying positional error induced by line simplification. International Journal of Geographical Information Science, 14(2), 113–130.CrossRef Google Scholar

Zhang, S. K., Liu, Z. J., Cai, Y., Wu, Z. L. and Shi, G. Y. (2016). AIS trajectories simplification and threshold determination. The Journal of Navigation, 69(4), 729–744.CrossRef Google Scholar

Zhang, Y. Q., Shi, G. Y., Li, S. and Zhang, S. K. (2020). Vessel trajectory online multi-dimensional simplification algorithm. The Journal of Navigation, 73(2), 342–363.CrossRef Google Scholar

Zhao, L. and Shi, G. (2019a). A novel similarity measure for clustering vessel trajectories based on dynamic time warping. The Journal of Navigation, 72(2), 290–306.CrossRef Google Scholar

Zhao, L. and Shi, G. (2019b). A trajectory clustering method based on Douglas-Peucker compression and density for marine traffic pattern recognition. Ocean Engineering, 172, 456–467.CrossRef Google Scholar

Figure 2. Schematic diagram of the sector pattern

Figure 3. Schematic diagram of threshold approximation

Figure 4. The uncompressed ships’ trajectories points in the Qiongzhou Strait

Figure 5. The uncompressed trajectories of ships in the Qiongzhou Strait

Figure 6. The compression results: (a) traditional grating compression; (b) DAG compression

Figure 8. Compression ratio–error curves: (a) average error; (b) maximum error

Figure 9. The distribution of adaptive thresholds: (a) overall compression ratio = 10; (b) overall compression ratio = 20; (c) overall compression ratio = 40

Figure 10. Compression errors graphs: (a) overall compression ratio = 10; (b) overall compression ratio = 20; (c) overall compression ratio = 40

Figure 11. ‘Compression ratio–error’ scatter diagrams: (a) overall compression ratio = 10; (b) overall compression ratio = 20; (c) overall compression ratio = 40

Article contents

A dynamic adaptive grating algorithm for AIS-based ship trajectory compression

Abstract

Keywords

1. Introduction

2. Methodology

2.1 Traditional grating compression algorithm

2.2 Dynamic adaptive grating (DAG) compression algorithm

2.2.1 Definition of threshold

2.2.2 Calculation of the candidate threshold

2.2.3 Evaluation of the compression performance

2.2.4 Performance comparison and approximation

2.2.5 Proposed DAG compression algorithm

3. Test and compression results

3.1 Ship trajectories dataset for testing

3.2 Compression results and comparison

4. Conclusion

Acknowledgment

Funding statement

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests