it.unimi.dsi.mg4j.util

Java Source Code / Java Documentation
1. 6.0 JDK Core
2. 6.0 JDK Modules
3. 6.0 JDK Modules com.sun
4. 6.0 JDK Modules com.sun.java
5. 6.0 JDK Modules sun
6. 6.0 JDK Platform
7. Ajax
8. Apache Harmony Java SE
9. Aspect oriented
10. Authentication Authorization
11. Blogger System
12. Build
13. Byte Code
14. Cache
15. Chart
16. Chat
17. Code Analyzer
18. Collaboration
19. Content Management System
20. Database Client
21. Database DBMS
22. Database JDBC Connection Pool
23. Database ORM
24. Development
25. EJB Server geronimo
26. EJB Server GlassFish
27. EJB Server JBoss 4.2.1
28. EJB Server resin 3.1.5
29. ERP CRM Financial
30. ESB
31. Forum
32. GIS
33. Graphic Library
34. Groupware
35. HTML Parser
36. IDE
37. IDE Eclipse
38. IDE Netbeans
39. Installer
40. Internationalization Localization
41. Inversion of Control
42. Issue Tracking
43. J2EE
44. JBoss
45. JMS
46. JMX
47. Library
48. Mail Clients
49. Net
50. Parser
51. PDF
52. Portal
53. Profiler
54. Project Management
55. Report
56. RSS RDF
57. Rule Engine
58. Science
59. Scripting
60. Search Engine
61. Security
62. Sevlet Container
63. Source Control
64. Swing Library
65. Template Engine
66. Test Coverage
67. Testing
68. UML
69. Web Crawler
70. Web Framework
71. Web Mail
72. Web Server
73. Web Services
74. Web Services apache cxf 2.0.1
75. Web Services AXIS2
76. Wiki Engine
77. Workflow Engines
78. XML
79. XML UI
Java
Java Tutorial
Java Open Source
Jar File Download
Java Articles
Java Products
Java by API
Photoshop Tutorials
Maya Tutorials
Flash Tutorials
3ds-Max Tutorials
Illustrator Tutorials
GIMP Tutorials
C# / C Sharp
C# / CSharp Tutorial
C# / CSharp Open Source
ASP.Net
ASP.NET Tutorial
JavaScript DHTML
JavaScript Tutorial
JavaScript Reference
HTML / CSS
HTML CSS Reference
C / ANSI-C
C Tutorial
C++
C++ Tutorial
Ruby
PHP
Python
Python Tutorial
Python Open Source
SQL Server / T-SQL
SQL Server / T-SQL Tutorial
Oracle PL / SQL
Oracle PL/SQL Tutorial
PostgreSQL
SQL / MySQL
MySQL Tutorial
VB.Net
VB.Net Tutorial
Flash / Flex / ActionScript
VBA / Excel / Access / Word
XML
XML Tutorial
Microsoft Office PowerPoint 2007 Tutorial
Microsoft Office Excel 2007 Tutorial
Microsoft Office Word 2007 Tutorial
Java Source Code / Java Documentation » Search Engine » mg4j » it.unimi.dsi.mg4j.util 
it.unimi.dsi.mg4j.util
MG4J: Managing Gigabytes for Java General-purpose utility classes.

Package Specification

The classes in this package are completely general utility classes that may be useful outside of MG4J. In particular, {@link it.unimi.dsi.mg4j.util.MutableString} is a highly-tuned, versatile class for content-based mutable strings.

Another general-purpose important component is {@link it.unimi.dsi.mg4j.util.MinimalPerfectHash}, which implements state-of-the-art, hypergraph-based algorithms for the construction of ordered minimal perfect hash tables.

If minimal perfect hash tables are too large, a good alternative is an {@link it.unimi.dsi.mg4j.util.ImmutableExternalPrefixDictionary}, which keeps most data in compressed form on disk, and uses a small in-memory index to access those data.

Java Source File NameTypeComment
BitVectorBooleanIterator.javaClass A boolean iterator returning the bits of a bit vector.
BloomFilter.javaClass A Bloom filter.

Instances of this class represent a set of character sequences or primitive-type arrays (with false positives) using a Bloom filter.

CircularCharArrayBuffer.javaClass A circular char buffer, that can be used to implement a sliding window over a text.
Fast.javaClass All-purpose optimised static-method container class.
FlyweightPrototype.javaInterface A prototype providing flyweight copies.

MG4J uses often flyweight copies to implement multithreading on read-only (but maybe stateful) classes.

FlyweightPrototypes.javaClass A class providing static methods and objects that do useful things with .
FrontCodedStringList.javaClass Compact storage of strings using front-coding compression.

This class stores a list of strings using front-coding compression (of course, the compression will be reasonable only if the list is sorted, but you could also use instances of this class just as a handy way to manage a large amount of strings).

HashCodeSignedMinimalPerfectHash.javaClassjava.lang.String.hashCode -signed order-preserving minimal perfect hash tables.
ImmutableBinaryTrie.javaClass An immutable implementation of binary tries.

Instance of this class are built starting from a lexicographically ordered list of cern.colt.bitvector.BitVector s representing binary words.

ImmutableExternalPrefixDictionary.javaClass An immutable prefix dictionary mostly stored in external memory.

A prefix dictionary is a data structure that stores efficiently a lexicographically ordered list of character sequences so that it is possible to obtain, given a character sequence s, its position in the sequence, and also the (possibly empty) interval of the sequence composed by character sequences that are extensions of s.

ImmutableExternalTreePrefixDictionary.javaClass An it.unimi.dsi.mg4j.util.ImmutableExternalPrefixDictionary that compresses words using a it.unimi.dsi.mg4j.compression.HuffmanCodec and approximates intervals using a it.unimi.dsi.mg4j.util.TernaryIntervalSearchTree .
ImmutableExternalTriePrefixDictionary.javaClass An it.unimi.dsi.mg4j.util.ImmutableExternalPrefixDictionary that compresses words using a it.unimi.dsi.mg4j.compression.HuTuckerCodec and approximates intervals using an it.unimi.dsi.mg4j.util.ImmutableTriePrefixTree that uses the same codec.
ImmutableTriePrefixTree.javaClass A class adapter from immutable binary tries to prefix trees.

Instances of this class map transparently strings into a binary trie.

IntBloomFilter.javaClass A Bloom filter for integers.

Instances of this class represent a set of integers (with false positives) using a Bloom filter.

LiterallySignedMinimalPerfectHash.javaClass Literally signed order-preserving minimal perfect hash tables.

This class source exemplifies a signed minimal perfect hash table that signes each term with the term itself, thus avoiding false positives.

MG4JClassParser.javaClass A small wrapper around JSAP's standard ClassStringParser .
MimeTypeResolver.javaClass A thin wrapper around a singleton instance of javax.activation.MimetypesFileTypeMap that tries to load /etc/mime.types into the map.
MinimalPerfectHash.javaClass Order-preserving minimal perfect hash tables.

Given a list of terms without duplicates, the constructors of this class finds an order-preserving minimal perfect hash function for the list.

MutableString.javaClass Fast, compact, optimised & versatile mutable strings.

Motivation

The classical Java string classes, java.lang.String and java.lang.StringBuffer , lie at the extreme of a spectrum (immutable and mutable).

However, large-scale text indexing requires some features that are not provided by these classes: in particular, the possibility of using a mutable string, once frozen, in the same optimised way of an immutable string.

MutableStrings.javaClass A static container providing utility objects and method for it.unimi.dsi.mg4j.util.MutableString s.
PermutedFrontCodedStringList.javaClass A it.unimi.dsi.mg4j.util.FrontCodedStringList whose indices are permuted.

It may happen that a list of strings compresses very well using front coding, but unfortunately alphabetical order is not the right order for the strings in the list.

ProgressLogger.javaClass Tunable progress logger.

This class provides a simple way to log progress information about long-lasting activities.

To use this class, you first create a new instance by passing a , a org.apache.log4j.Priority and a time interval in millisecond (you can use constants such as ProgressLogger.ONE_MINUTE ). Information will be logged about the current state of affairs no more often than the given time interval.

ProgressMeter.javaClass Tunable progress meter.

This class provides a simple way to display progress information about long-lasting activities.

Properties.javaClass An extension of org.apache.commons.configuration.PropertiesConfiguration providing setters for primitive types, a simpler and transparently handling of java.lang.Enum lowercased keys.

All accessors defined in org.apache.commons.configuration.PropertiesConfiguration have a polymorphic counterpart taking an java.lang.Enum instead of a string: java.lang.Enum.name and java.lang.String.toLowerCase are applied before delegating to the corresponding string-based method.

SemiExternalOffsetList.javaClass Provides semi-external random access to offsets of an it.unimi.dsi.mg4j.index.Index index .
ShiftAddXorLongSignedMinimalPerfectHash.javaClass Order-preserving minimal perfect hash tables signed using Shift-Add-Xor long hashes.
ShiftAddXorSignedMinimalPerfectHash.javaClass Order-preserving minimal perfect hash tables signed using Shift-Add-Xor hashes.
SignedMinimalPerfectHash.javaClass Signed order-preserving minimal perfect hash tables.

Minimal perfect hash tables will always return a result, even for terms that were not present in the collection indexed by the table.

TernaryIntervalSearchTree.javaClass Ternary interval search trees.

Ternary search trees are a data structure used to store words over an alphabet; they are a useful alternatives to tries when the alphabet is large.

Ternary interval search trees have the additional properties of being able to locate quickly intervals of words extending a given prefix (where “quickly” means that no more successful character comparisons than the prefix length are performed).

TextPattern.javaClass QuickSearch matching against a constant string.

The of the Java API are a powerful tool; however, when searching for a constant pattern many algorithms can increase of orders magnitude the speed of a search.

This class provides constant-pattern text search facilities by implementing Sunday's QuickSearch (a simplified but very effective variant of the Boyer—Moore search algorithm) using compact approximators, a randomised data structure that can accomodate in a small space (but in an approximated way) the bad-character shift table of a large alphabet such as Unicode.

Since a large subset of US-ASCII is used in all languages (e.g., whitespace, punctuation, etc.), this class caches separately the shifts for the first 128 Unicode characters, resulting in very good performance even on text in pure US-ASCII.

Note that the MutableString.indexOf(MutableString) indexOf methods of MutableString use a even more simplified variant of QuickSearch which is less efficient, but has a smaller setup time and does not generate any object.

www.java2java.com | Contact Us
Copyright 2009 - 12 Demo Source and Support. All rights reserved.
All other trademarks are property of their respective owners.