ZIB

1

Electronic Resource

Using extended feature objects for partial similarity retrieval (1997)

Berchtold, Stefan ; Keim, Daniel A. ; Kriegel, Hans-Peter

Springer

The VLDB journal 6 (1997), S. 333-348

Details

ISSN: 0949-877X

Keywords: Indexing and query processing of spatial objects ; Partial similarity retrieval ; CAD databases ; Fourier transformation

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract. In this paper, we introduce the concept of extended feature objects for similarity retrieval. Conventional approaches for similarity search in databases map each object in the database to a point in some high-dimensional feature space and define similarity as some distance measure in this space. For many similarity search problems, this feature-based approach is not sufficient. When retrieving partially similar polygons, for example, the search cannot be restricted to edge sequences, since similar polygon sections may start and end anywhere on the edges of the polygons. In general, inherently continuous problems such as the partial similarity search cannot be solved by using point objects in feature space. In our solution, we therefore introduce extended feature objects consisting of an infinite set of feature points. For an efficient storage and retrieval of the extended feature objects, we determine the minimal bounding boxes of the feature objects in multidimensional space and store these boxes using a spatial access structure. In our concrete polygon problem, sets of polygon sections are mapped to 2D feature objects in high-dimensional space which are then approximated by minimal bounding boxes and stored in an R*-tree. The selectivity of the index is improved by using an adaptive decomposition of very large feature objects and a dynamic joining of small feature objects. For the polygon problem, translation, rotation, and scaling invariance is achieved by using the Fourier-transformed curvature of the normalized polygon sections. In contrast to vertex-based algorithms, our algorithm guarantees that no false dismissals may occur and additionally provides fast search times for realistic database sizes. We evaluate our method using real polygon data of a supplier for the car manufacturing industry.
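
Illustrative sketch (not part of the catalog record or the paper): the abstract describes approximating a feature object by its minimal bounding box and storing the boxes in a spatial access structure. The Python below shows only that bounding-box idea on a finite sample of feature points. The function names and the sampling are assumptions made for illustration; a box computed from samples can be smaller than the true box of an infinite point set, so this does not reproduce the paper's no-false-dismissal guarantee.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


def bounding_box(points: Sequence[Point]) -> Box:
    """Return the axis-aligned bounding box of a non-empty, finite point set."""
    if not points:
        raise ValueError("need at least one point")
    dims = len(points[0])
    lower = [min(p[d] for p in points) for d in range(dims)]
    upper = [max(p[d] for p in points) for d in range(dims)]
    return lower, upper


def boxes_intersect(a: Box, b: Box) -> bool:
    """True if the two boxes overlap in every dimension (touching counts)."""
    return all(
        a_lo <= b_hi and b_lo <= a_hi
        for a_lo, a_hi, b_lo, b_hi in zip(a[0], a[1], b[0], b[1])
    )


# Hypothetical sampled feature points of one feature object.
samples = [(0.0, 1.0), (2.0, -1.0), (1.0, 3.0)]
box = bounding_box(samples)
query_box = ([1.5, 0.0], [4.0, 0.5])
candidate = boxes_intersect(box, query_box)
```

A spatial index such as an R*-tree would organize many such boxes hierarchically; the intersection test above is the per-box filter that such an index applies.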

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/s007780050049

2

Electronic Resource

Bioinformatik (1999)

Backofen, Rolf ; Bry, François ; Clote, Peter ; [et al.]

Springer

Informatik-Spektrum 22 (1999), S. 376-378

Details

ISSN: 1432-122X

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/s002870050166

3

Electronic Resource

Multidimensional Index Structures in Relational Databases (2000)

Böhm, Christian ; Berchtold, Stefan ; Kriegel, Hans-Peter ; [et al.]

Springer

Journal of intelligent information systems 15 (2000), S. 51-70

Details

ISSN: 1573-7675

Keywords: multidimensional index ; relational database ; similarity search ; range query

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Efficient query processing is one of the basic needs for data mining algorithms. Clustering algorithms, association rule mining algorithms and OLAP tools all rely on efficient query processors being able to deal with high-dimensional data. Inside such a query processor, multidimensional index structures are used as a basic technique. As the implementation of such an index structure is a difficult and time-consuming task, we propose a new approach to implement an index structure on top of a commercial relational database system. In particular, we map the index structure to a relational database design and simulate the behavior of the index structure using triggers and stored procedures. This can be easily done for a very large class of multidimensional index structures. To demonstrate the feasibility and efficiency, we implemented an X-tree on top of Oracle8. We ran several experiments on large databases and recorded a performance improvement up to a factor of 11.5 compared to a sequential scan of the database.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1008729828172

4

Electronic Resource

Approximation-Based Similarity Search for 3-D Surface Segments (1998)

Kriegel, Hans-Peter ; Seidl, Thomas

Springer

Geoinformatica 2 (1998), S. 113-147

Details

ISSN: 1573-7624

Keywords: approximation-based similarity search ; multi-step similarity query processing ; ellipsoid queries on multidimensional index structures ; 3-D spatial database systems

Source: Springer Online Journal Archives 1860-2000

Topics: Geography

Notes: Abstract The issue of finding similar 3-D surface segments arises in many recent applications of spatial database systems, such as molecular biology, medical imaging, CAD, and geographic information systems. Surface segments being similar in shape to a given query segment are to be retrieved from the database. The two main questions are how to define shape similarity and how to efficiently execute similarity search queries. We propose a new similarity model based on shape approximation by multi-parametric surface functions that are adaptable to specific application domains. We then define shape similarity of two 3-D surface segments in terms of their mutual approximation errors. Applying the multi-step query processing paradigm, we propose algorithms to efficiently support complex similarity search queries in large spatial databases. A new query type, called the ellipsoid query, is utilized in the filter step. Ellipsoid queries, being specified by quadratic forms, represent a general concept for similarity search. Our major contribution is the introduction of efficient algorithms to perform ellipsoid queries on multidimensional index structures. Experimental results on a large 3-D protein database containing 94,000 surface segments demonstrate the successful application and the high performance of our method.
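
Illustrative sketch (not part of the catalog record or the paper): the abstract states that ellipsoid queries are specified by quadratic forms. Assuming the usual quadratic form distance d(p, q)^2 = (p - q)^T A (p - q) with a symmetric positive definite matrix A, the Python below answers such a query by a plain linear scan. The function names and the sample data are hypothetical, and the scan stands in for, rather than reproduces, the index-based algorithms the paper contributes.

```python
from typing import List, Sequence


def quadratic_form_distance_sq(
    p: Sequence[float], q: Sequence[float], A: Sequence[Sequence[float]]
) -> float:
    """Squared distance (p - q)^T A (p - q) for a square matrix A."""
    diff = [pi - qi for pi, qi in zip(p, q)]
    n = len(diff)
    return sum(diff[i] * A[i][j] * diff[j] for i in range(n) for j in range(n))


def ellipsoid_query(
    points: Sequence[Sequence[float]],
    q: Sequence[float],
    A: Sequence[Sequence[float]],
    eps: float,
) -> List[Sequence[float]]:
    """Return all points whose quadratic form distance to q is at most eps."""
    return [p for p in points if quadratic_form_distance_sq(p, q, A) <= eps * eps]


# Hypothetical example: A stretches the second axis, so the query region
# is an ellipse that is narrower along that axis.
A = [[1.0, 0.0], [0.0, 4.0]]
points = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.25)]
hits = ellipsoid_query(points, (0.0, 0.0), A, 1.0)
```

With A set to the identity matrix the query reduces to an ordinary Euclidean range query, which is why the quadratic form is a general way to express similarity regions.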

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1009760031965

5

Electronic Resource

Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications (1998)

Sander, Jörg ; Ester, Martin ; Kriegel, Hans-Peter ; [et al.]

Springer

Data mining and knowledge discovery 2 (1998), S. 169-194

Details

ISSN: 1573-756X

Keywords: clustering algorithms ; spatial databases ; efficiency ; applications

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we generalize this algorithm in two important directions. The generalized algorithm, called GDBSCAN, can cluster point objects as well as spatially extended objects according to both their spatial and their nonspatial attributes. In addition, four applications using 2D points (astronomy), 3D points (biology), 5D points (earth science) and 2D polygons (geography) are presented, demonstrating the applicability of GDBSCAN to real-world problems.
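
Illustrative sketch (not part of the catalog record or the paper): the abstract refers to the density-based notion of clusters used by DBSCAN. The Python below is a minimal version of plain DBSCAN on points, not of the generalized GDBSCAN. The parameter names `eps` and `min_pts` follow common usage and do not appear in the record; the neighborhood search is a naive linear scan and needs Python 3.8 or later for `math.dist`.

```python
from math import dist
from typing import List, Optional, Sequence

NOISE = -1


def region_query(points: Sequence[Sequence[float]], i: int, eps: float) -> List[int]:
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, p in enumerate(points) if dist(points[i], p) <= eps]


def dbscan(points: Sequence[Sequence[float]], eps: float, min_pts: int) -> List[int]:
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point, so expand it
        cluster += 1
    return [NOISE if label is None else label for label in labels]


# Hypothetical data: two dense groups and one isolated point.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(data, eps=1.5, min_pts=3)
```

Replacing `region_query` with a lookup in a spatial index is the usual way to avoid the quadratic cost of the linear scan.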

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1009745219419

6

Electronic Resource

A Fast Parallel Clustering Algorithm for Large Spatial Databases (1999)

Xu, Xiaowei ; Jäger, Jochen ; Kriegel, Hans-Peter

Springer

Data mining and knowledge discovery 3 (1999), S. 263-290

Details

ISSN: 1573-756X

Keywords: clustering algorithms ; parallel algorithms ; distributed algorithms ; scalable data mining ; distributed index structures ; spatial databases

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we present PDBSCAN, a parallel version of this algorithm. We use the ‘shared-nothing’ architecture with multiple computers interconnected through a network. A fundamental component of a shared-nothing system is its distributed data structure. We introduce the dR*-tree, a distributed spatial index structure in which the data is spread among multiple computers and the indexes of the data are replicated on every computer. We implemented our method using a number of workstations connected via Ethernet (10 Mbit). A performance evaluation shows that PDBSCAN offers nearly linear speedup and has excellent scaleup and sizeup behavior.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1009884809343

7

Electronic Resource

Spatial Data Mining: Database Primitives, Algorithms and Efficient DBMS Support (2000)

Ester, Martin ; Frommelt, Alexander ; Kriegel, Hans-Peter ; [et al.]

Springer

Data mining and knowledge discovery 4 (2000), S. 193-216

Details

ISSN: 1573-756X

Keywords: mining spatial data ; database primitives for KDD

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Spatial data mining algorithms heavily depend on the efficient processing of neighborhood relations since the neighbors of many objects have to be investigated in a single run of a typical algorithm. Therefore, providing general concepts for neighborhood relations as well as an efficient implementation of these concepts will allow a tight integration of spatial data mining algorithms with a spatial database management system. This will speed up both the development and the execution of spatial data mining algorithms. In this paper, we define neighborhood graphs and paths and a small set of database primitives for their manipulation. We show that typical spatial data mining algorithms are well supported by the proposed basic operations. For finding significant spatial patterns, only certain classes of paths “leading away” from a starting object are relevant. We discuss filters allowing only such neighborhood paths which will significantly reduce the search space for spatial data mining algorithms. Furthermore, we introduce neighborhood indices to speed up the processing of our database primitives. We implemented the database primitives on top of a commercial spatial database management system. The effectiveness and efficiency of the proposed approach were evaluated by using an analytical cost model and an extensive experimental study on a geographic database.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1009843930701
