Interval data

Description

The interval overlap join is among the most widely used operators in temporal databases. Essentially, temporal databases store relations of explicit attributes that conform to a schema, while each data object carries a validity interval. In this context, an interval overlap join identifies object pairs (or combinations) with overlapping validity. Besides temporal databases, interval joins also find application in other domains: in multi-dimensional spaces, where an object can be represented as a set of intervals on a space-filling curve, and in uncertain data management, where uncertain values are represented as intervals (which can be paired with confidence values).

We investigated plane sweep as a competitive approach for interval joins; in brief, we proposed an optimized version of plane sweep which has the potential of performing fewer comparisons than the actual join output size. In addition, we devised a novel orthogonal parallel paradigm to benefit from modern hardware. In our follow-up work, we studied temporal aggregation based on interval joins by introducing a semi-join operation, as well as the band-join variant of the query.
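As an illustration, the basic forward-scan flavor of plane sweep can be sketched as follows. This is a minimal sketch and not the optimized or parallel version from the publications below; it assumes closed intervals given as (start, end) tuples with start <= end, and the function name forward_scan_join is hypothetical.

```python
def forward_scan_join(R, S):
    """Return all pairs (r, s), r from R and s from S, whose closed intervals overlap."""
    R = sorted(R)  # sort both inputs by start endpoint
    S = sorted(S)
    out = []
    i = j = 0
    while i < len(R) and j < len(S):
        if R[i][0] <= S[j][0]:
            # r starts first: scan forward in S while intervals start before r ends
            r = R[i]
            k = j
            while k < len(S) and S[k][0] <= r[1]:
                out.append((r, S[k]))
                k += 1
            i += 1
        else:
            # s starts first: scan forward in R while intervals start before s ends
            s = S[j]
            k = i
            while k < len(R) and R[k][0] <= s[1]:
                out.append((R[k], s))
                k += 1
            j += 1
    return out


pairs = forward_scan_join([(1, 5), (6, 8)], [(4, 7), (9, 10)])
```

Each interval is compared only against intervals of the other input that start within its extent, so no auxiliary active list has to be maintained during the sweep.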

Publications

Panagiotis Bouros, George Christodoulou, Christian Rauch, Artur Titkov and Nikos Mamoulis:
Querying Interval Data on Steroids
To appear in IEEE Transactions on Knowledge and Data Engineering (TKDE)

Christian Rauch and Panagiotis Bouros:
Fast Indexing for Temporal Information Retrieval
Proceedings of the ACM on Management of Data (PACMMOD), Vol 3, No 4, September 2025
To be presented at the 2026 ACM International Conference on Management of Data (ACM SIGMOD), Bengaluru, India, May 31 - June 5, 2026

Panagiotis Bouros and Nikos Mamoulis:
Relevance Queries for Interval Data
Proceedings of the ACM on Management of Data (PACMMOD), Vol 3, No 3, June 2025
To be presented at the 2025 ACM International Conference on Management of Data (ACM SIGMOD), Berlin, Germany, June 22-27, 2025

Panagiotis Bouros, Artur Titkov, George Christodoulou, Christian Rauch and Nikos Mamoulis:
HINT on Steroids: Batch Query Processing for Interval Data
Proceedings of the 27th International Conference on Extending Database Technology (EDBT), Paestum, Italy, March 25-29, 2024

George Christodoulou, Panagiotis Bouros and Nikos Mamoulis:
LIT: Lightning-fast In-memory Temporal Indexing
Proceedings of the ACM on Management of Data (PACMMOD), Vol 2, No 1, February 2024
Presented at the 2024 ACM International Conference on Management of Data (ACM SIGMOD), Santiago, Chile, June 9-15, 2024

George Christodoulou, Panagiotis Bouros and Nikos Mamoulis:
HINT: A Hierarchical Interval Index for Allen Relationships
International Journal on Very Large Data Bases (VLDBJ), Vol 33, No 1, January 2024

George Christodoulou, Panagiotis Bouros and Nikos Mamoulis:
HINT: A Hierarchical Index for Intervals in Main Memory
Proceedings of the 2022 ACM International Conference on Management of Data (ACM SIGMOD), Philadelphia, PA, USA, June 12-17, 2022

Panagiotis Bouros, Dimitrios Tsitsigkos, Nikos Mamoulis and Manolis Terrovitis:
In-Memory Interval Joins
International Journal on Very Large Data Bases (VLDBJ), Vol 30, No 4, July 2021

Panagiotis Bouros, Konstantinos Lampropoulos, Dimitrios Tsitsigkos, Nikos Mamoulis and Manolis Terrovitis:
Band Joins for Interval Data
Proceedings of the 23rd International Conference on Extending Database Technology (EDBT), Copenhagen, Denmark, March 30 - April 2, 2020

Panagiotis Bouros and Nikos Mamoulis:
Interval Count Semi-Joins
Proceedings of the 21st International Conference on Extending Database Technology (EDBT), Vienna, Austria, March 26-29, 2018

Panagiotis Bouros and Nikos Mamoulis:
A Forward Scan based Plane Sweep Algorithm for Parallel Interval Joins
Proceedings of the VLDB Endowment (PVLDB), Vol 10, No 11, July 2017

Theses

Christian Rauch:
Temporal Information Retrieval (working title)
Doctorate thesis, Johannes Gutenberg University Mainz, Germany

Panagiotis Simatis:
Indexing and Querying Complex Data (working title)
Doctorate thesis, University of Ioannina, Greece
Primary supervisor: Prof. Nikos Mamoulis

Bodowin Weber:
Visualizing and benchmarking interval indexing
Bachelor thesis, Johannes Gutenberg University Mainz, Germany

Giorgos Kotsinas:
Ranking Queries over Range Data
Master thesis, University of Ioannina, Greece, 2024
Primary supervisor: Prof. Nikos Mamoulis

Jan Raider:
A Graphical User Interface for Managing Interval Data
Bachelor thesis, Johannes Gutenberg University Mainz, Germany, 2024

George Christodoulou:
Managing Big and Complex Data in Main Memory
Doctorate thesis, University of Ioannina, Greece, 2023

Source code

EDBT'24, GitHub repository for batch HINT, https://github.com/pbour/batch_hint
ACM PACMMOD'24, GitHub repository for LIT, https://github.com/GiorgosChristodoulou/LIT
VLDBJ and ACM SIGMOD'22, GitHub repository for HINT, https://github.com/pbour/hint
PVLDB'17, GitHub repository for interval join, https://github.com/pbour/ijoin

Spatial data

Description

The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. Despite the vast amount of research output on the subject, the complexity of the data and the consideration of modern commodity hardware, which supports parallel processing, open the road to a number of interesting directions for future research. We briefly survey the background in spatial join computation and outline these directions. In an effort to cover some of the modern challenges, we investigate the in-memory and parallel evaluation of spatial joins by tuning a classic partitioning-based algorithm. The study shows that, compared to a straightforward implementation of the algorithm, performance can be significantly improved by properly selecting partitioning parameters based on data statistics, in order to tune the algorithm for the given join inputs. The proposed parallel implementation scales gracefully with the number of threads, reducing the cost of the join to at most one second even for join inputs with tens of millions of spatial objects.
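To make the partitioning idea concrete, here is a minimal single-threaded sketch of a grid-based spatial join over rectangles. It is an illustration under assumptions rather than the tuned algorithm of the study: rectangles are (x1, y1, x2, y2) tuples, the grid is uniform with a hypothetical cell parameter standing in for the partitioning parameters, and duplicate results are avoided with the common reference-point test.

```python
from collections import defaultdict


def tiles_of(rect, cell):
    """Yield every grid tile (tx, ty) that the rectangle intersects."""
    x1, y1, x2, y2 = rect
    for tx in range(int(x1 // cell), int(x2 // cell) + 1):
        for ty in range(int(y1 // cell), int(y2 // cell) + 1):
            yield (tx, ty)


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def grid_spatial_join(R, S, cell):
    """Return all intersecting pairs (r, s), each reported exactly once."""
    grid_r, grid_s = defaultdict(list), defaultdict(list)
    for r in R:  # replicate each rectangle to all tiles it touches
        for t in tiles_of(r, cell):
            grid_r[t].append(r)
    for s in S:
        for t in tiles_of(s, cell):
            grid_s[t].append(s)
    out = []
    for t, rs in grid_r.items():  # tiles are independent of each other
        for r in rs:
            for s in grid_s.get(t, ()):
                if intersects(r, s):
                    # report the pair only in the tile holding the lower-left
                    # corner of the intersection area (reference point)
                    ref = (int(max(r[0], s[0]) // cell), int(max(r[1], s[1]) // cell))
                    if ref == t:
                        out.append((r, s))
    return out


result = grid_spatial_join([(0, 0, 3, 3), (10, 10, 11, 11)],
                           [(2, 2, 5, 5), (20, 20, 21, 21)], cell=2)
```

Because every tile is joined independently, the outer loop over tiles is the natural unit to distribute across threads, and the choice of cell trades replication against the number of comparisons per tile.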

Publications

Achilleas Michalopoulos, Dimitrios Tsitsigkos, Panagiotis Bouros, Nikos Mamoulis and Manolis Terrovitis:
Efficient Distance Queries on Non-point Data
ACM Transactions on Spatial Algorithms and Systems (TSAS), Vol 11, No 1, March 2025

Dimitrios Tsitsigkos, Panagiotis Bouros, Konstantinos Lampropoulos, Nikos Mamoulis and Manolis Terrovitis:
Two-layer Space-oriented Partitioning for Non-point Data
IEEE Transactions on Knowledge and Data Engineering (TKDE), Vol 36, No 3, March 2024

Achilleas Michalopoulos, Dimitrios Tsitsigkos, Panagiotis Bouros, Nikos Mamoulis and Manolis Terrovitis:
Efficient Nearest Neighbor Queries on Non-point Data
Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL), Hamburg, Germany, November 13-16, 2023

Dimitrios Tsitsigkos, Konstantinos Lampropoulos, Panagiotis Bouros, Nikos Mamoulis and Manolis Terrovitis:
A Two-layer Partitioning for Non-point Spatial Data
Proceedings of the 37th IEEE International Conference on Data Engineering (ICDE), Chania, Greece, April 19-22, 2021

Dimitrios Tsitsigkos, Konstantinos Lampropoulos, Panagiotis Bouros, Nikos Mamoulis and Manolis Terrovitis:
A Two-level Spatial In-Memory Index
CoRR abs/2005.08600, May 2020

Dimitrios Tsitsigkos, Panagiotis Bouros, Nikos Mamoulis and Manolis Terrovitis:
Parallel In-Memory Evaluation of Spatial Joins
Proceedings of the 27th ACM International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL), Chicago, Illinois, USA, November 5-8, 2019

Panagiotis Bouros and Nikos Mamoulis:
Spatial Joins: What's next?
ACM SIGSPATIAL Special, Vol 11, No 1, March 2019

Theses

Achilleas Michalopoulos:
Partitioning and Indexing Techniques for Scalable Spatial Query Evaluation
Doctorate thesis, University of Ioannina, Greece

Dimitrios Tsitsigkos:
Join Operators for Complex Data
Doctorate thesis, University of Ioannina, Greece, 2024

Text and set data

Description

Sets are ubiquitous in computer science; they model transactions, scientific data, Web data, text etc. Contemporary data management systems allow the definition of set-valued data attributes and support operations such as containment queries. Join operations are also extended to include predicates on sets such as containment and similarity.

Regarding the former, we proposed (i) an adaptive methodology that alleviates the shortcomings of the state-of-the-art method, which indexes the inputs using a prefix tree and an inverted index, and (ii) a novel join paradigm which interleaves indexing with the actual join process. We also studied the parallel computation of set containment joins. Regarding the similarity predicate, we proposed the grouping technique, which builds on top of the widely adopted prefix filtering and allows objects with identical prefixes to be processed in batches. More recently, another line of our work has focused on experimentally comparing set similarity join evaluation methods in both centralized and distributed environments.
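For readers unfamiliar with prefix filtering, the following minimal sketch shows the plain technique for a self-join under Jaccard similarity. It does not include the grouping optimization mentioned above; the function name, the Jaccard measure, and the frequency-based token order are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def jaccard(a, b):
    return len(a & b) / len(a | b)


def prefix_filter_self_join(sets, threshold):
    """Return index pairs (j, i), j < i, with Jaccard similarity >= threshold (> 0)."""
    freq = Counter(tok for s in sets for tok in s)
    # canonical token order: rare tokens first, ties broken by the token itself
    ordered = [sorted(s, key=lambda tok: (freq[tok], tok)) for s in sets]
    index = defaultdict(list)  # token -> ids of sets having it in their prefix
    result = []
    for i, toks in enumerate(ordered):
        # two sets can only qualify if their prefixes share a token;
        # the small epsilon guards against floating-point round-up
        prefix_len = len(toks) - math.ceil(threshold * len(toks) - 1e-9) + 1
        candidates = set()
        for tok in toks[:prefix_len]:
            candidates.update(index[tok])
            index[tok].append(i)
        for j in sorted(candidates):  # verify every candidate exactly
            if jaccard(sets[i], sets[j]) >= threshold:
                result.append((j, i))
    return result


matches = prefix_filter_self_join(
    [{"a", "b", "c", "d"}, {"a", "b", "c", "e"}, {"x", "y"}], 0.5)
```

Only the prefix tokens are indexed and probed, so most dissimilar pairs are never compared; the exact verification step keeps the result correct.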

Publications

Fabian Fier, Nikolaus Augsten, Panagiotis Bouros, Ulf Leser and Johann-Christoph Freytag:
Set Similarity Joins on MapReduce: An Experimental Survey
Proceedings of the VLDB Endowment (PVLDB), Vol 11, No 10, May 2018

Panagiotis Bouros, Nikos Mamoulis, Shen Ge and Manolis Terrovitis:
Set Containment Join Revisited
International Journal on Knowledge and Information Systems (KAIS), Vol 49, No 1, October 2016

Willi Mann, Nikolaus Augsten and Panagiotis Bouros:
An Empirical Evaluation of Set Similarity Join Techniques
Proceedings of the VLDB Endowment (PVLDB), Vol 9, No 9, May 2016

Anja Kunkel, Astrid Rheinländer, Christopher Schiefer, Sven Helmer, Panagiotis Bouros and Ulf Leser:
PIEJoin: Towards Parallel Set Containment Joins using Prefix Tree Indexes
Proceedings of the 28th International Conference on Scientific and Statistical Database Management (SSDBM), Budapest, Hungary, July 18-20, 2016

Panagiotis Bouros, Shen Ge and Nikos Mamoulis:
Spatio-Textual Similarity Joins
Proceedings of the VLDB Endowment (PVLDB), Vol 6, No 1, November 2012

Manolis Terrovitis, Panagiotis Bouros, Panos Vassiliadis, Timos Sellis and Nikos Mamoulis:
Efficient Answering of Set Containment Queries for Skewed Item Distributions
Proceedings of the 14th International Conference on Extending Database Technology (EDBT), Uppsala, Sweden, March 21-25, 2011

Source code

PVLDB'16, ssjoins, http://ssjoin.dbresearch.uni-salzburg.at

Spatio-textual data

Description

We introduced the Spatio-Textual Similarity Join operation (ST-SJOIN) to identify object pairs (or combinations in case of multiple inputs) which are closely located in space and have a similar textual description. This is the first work on joining objects carrying spatial and text information. Intuitively, ST-SJOIN comes as a hybrid query operator which applies a join predicate in two dimensions of the data at the same time, and therefore, it finds application in a wide range of domains. In social network analysis, ST-SJOIN can be used for social recommendations; e.g., to recommend potential friend relationships between people with common interests (modeled as a keyword set), that live or socialize in nearby areas. As another application, ST-SJOIN can improve the effectiveness of data de-duplication, i.e., the task of retrieving near-identical objects; e.g., identifying photos that picture the same object in the photo-sharing services of Flickr or Instagram.
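The semantics of the operator can be illustrated with a brute-force sketch. It only pins down what a result pair is and says nothing about efficient evaluation; the distance threshold eps, the similarity threshold theta, the use of Euclidean distance and Jaccard similarity, and the function name st_sjoin are assumptions made for this example.

```python
import math
from itertools import combinations


def st_sjoin(objects, eps, theta):
    """objects: list of (id, (x, y), keyword_set).

    Return id pairs that are within distance eps of each other and whose
    keyword sets have Jaccard similarity of at least theta.
    """
    out = []
    for (ida, pa, ka), (idb, pb, kb) in combinations(objects, 2):
        if math.dist(pa, pb) > eps:  # spatial predicate
            continue
        union = ka | kb
        if union and len(ka & kb) / len(union) >= theta:  # textual predicate
            out.append((ida, idb))
    return out


users = [
    ("u1", (0, 0), {"jazz", "hiking"}),
    ("u2", (1, 1), {"jazz", "hiking", "chess"}),
    ("u3", (50, 50), {"jazz", "hiking"}),
]
recommendations = st_sjoin(users, eps=2, theta=0.5)
```

In the friend-recommendation scenario, u1 and u2 qualify because they are both nearby and share most interests, while u3 shares the interests but is too far away.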

Publications

Artur Titkov and Panagiotis Bouros:
Spatially Combined Keyword Searches
Proceedings of the 25th International Conference on Extending Database Technology (EDBT), Edinburgh, UK, March 29 - April 1, 2022

Panagiotis Bouros, Shen Ge and Nikos Mamoulis:
Spatio-Textual Similarity Joins
Proceedings of the VLDB Endowment (PVLDB), Vol 6, No 1, November 2012

Theses

Yazhou Pan:
Processing spatial keyword search queries
Bachelor thesis, Johannes Gutenberg University Mainz, Germany, 2023

Maximilian Detlef Zerbe:
Spatio-textual outlier detection
Bachelor thesis, Johannes Gutenberg University Mainz, Germany, 2023

Artur Titkov:
Spatially Combined Text Searches
Bachelor thesis, Johannes Gutenberg University Mainz, Germany, 2020

Shen Ge:
Advanced analysis and join queries in multidimensional spaces
Doctorate thesis, University of Hong Kong, China PR, 2012

Ranking

Description

The number of join results is often overwhelming; as only a subset would be further examined, a top-k join operator which retrieves the best k results is of greater practical value. Top-k joins have been extensively studied in the past but only for relational equijoins, i.e., for primitive (numerical) join attributes and the equality predicate.

To fill this gap, we introduced a generalized ranking join operator on complex data types. For example, spatial locations as join attributes can be used to recommend to the visitors of a city the best k pairs of restaurants and hotels within a short distance from each other that have the highest combined ratings. Top-k joins for string join attributes find application in tasks like data integration, where the score ratings from different data sources are combined, or data cleaning and de-duplication. The evaluation of relational ranking joins is challenged by the cost of accessing the data objects; however, in the case of complex data types, the computational cost can easily become the bottleneck. In view of this, we proposed a novel evaluation paradigm which accesses objects from the inputs in blocks to reduce the CPU cost without compromising the access (I/O) cost.
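The restaurant and hotel example can be written down as a brute-force sketch that fixes the semantics of a top-k spatial distance join. It is not the block-based evaluation paradigm described above; the tuple layout, the threshold eps, the sum as the combined score, and the function name are assumptions for illustration.

```python
import heapq
import math


def topk_distance_join(restaurants, hotels, eps, k):
    """Each item is (name, (x, y), rating).

    Return the k pairs within distance eps that have the highest combined
    rating, as (score, restaurant_name, hotel_name) in descending order.
    """
    pairs = (
        (r[2] + h[2], r[0], h[0])
        for r in restaurants
        for h in hotels
        if math.dist(r[1], h[1]) <= eps  # join predicate on the locations
    )
    return heapq.nlargest(k, pairs)  # ranking on the aggregated score


restaurants = [("R1", (0, 0), 4.5), ("R2", (5, 5), 5.0)]
hotels = [("H1", (0, 1), 4.0), ("H2", (5, 6), 3.0), ("H3", (100, 100), 5.0)]
best = topk_distance_join(restaurants, hotels, eps=2, k=1)
```

The sketch enumerates every qualifying pair before ranking, which is exactly the cost that a dedicated top-k join operator aims to avoid.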

Publications

Shuyao Qi, Panagiotis Bouros and Nikos Mamoulis:
Top-k Spatial Distance Joins
International Journal on Advances of Computer Science for Geographic Information Systems (GeoInformatica), Vol 24, No 3, July 2020

Shuyao Qi, Panagiotis Bouros and Nikos Mamoulis:
Top-k String Similarity Joins
Proceedings of the 32nd International Conference on Scientific and Statistical Database Management (SSDBM), Vienna, Austria, July 7-9, 2020

Shuyao Qi, Panagiotis Bouros and Nikos Mamoulis:
Efficient top-k Joins on Complex Data Types
TR-2015-03, Department of Computer Science, HKU, Hong Kong SAR, China, December 2015

Shuyao Qi, Panagiotis Bouros and Nikos Mamoulis:
Efficient Top-k Spatial Distance Joins
Proceedings of the 13th International Symposium on Spatial and Temporal Databases (SSTD), Munich, Germany, August 21-23, 2013

Thesis

Shuyao Qi:
Advanced Ranking Queries on Composite Data
Doctorate thesis, University of Hong Kong, China PR, 2016

Scalable Query Processing
