Java Data Structures

There are a number of commonly used Java data structures. Here are links to additional documentation and highlights of structural aspects: 

  • Array: fixed size, best for primitives 
  • List: fixed size, interface, ordered, allows duplicate
  • ArrayList: resizable , add to top, preferred unless need the capabilities of another structure
  • Vector: resizable, synchronized, add to top
  • Stack: LIFO, push/pop
  • Queue: FIFO, add to bottom/remove from top, interface
  • LinkedList: doubly-linked, push/pop, many other methods
  • Set: no duplicates, interface
  • HashMap: random access
  • TreeMap: sorted, efficient search
  • Binary Tree: each node has <=2 children
  • Binary Search Tree: sorted, fast lookup/add/remove
  • k-d Tree: k dimensional, space-partitioning, nearest neighbor search
  • Red-Black Tree: self-balancing, binary search tree, extra bit per node
  • Iterator: move forward and backward through data
  • Collections: perform operations on collections of data
  • Working with Java Lists: methods, iterators, collections
  • Data Structure Decisions: change, access, indexing, sizing
  • Common Java Array Manipulation Constructs: split, sort, binary search, iteration
  • Common Java ArrayList Manipulation Constructs: add, indexOf, remove, sort, binary search, interator
  • B-tree and Binary Search Tree Data Structures: like TreeMap
  • Sorting Algorithms: Collections sort, exchange, hybrid, insertion, merge
  • Big O Notation: O(1), O(log (n)), O(n), O(n log(n))
  • Update and Access Overview: The diagram below indicates how elements and values are updated and accessed. This is a broad overview of commonly used data structure classes, interfaces and objects ... please see the links above and below for more details. Some general observations are:
    • Arrays are different than the other structures in that they are accessed using bracketed index [n] references, such as myStringArray[3] instead of methods and can only contain primitives, not objects.
    • ArrayLists are generally preferred over Lists as they allow resizing. It's a very commonly used structure.
    • Vectors are much like ArrayLists with the additional characteristic that they are synchronized.
    • Queues and Stacks are specialized for LIFO and FIFO access.
    • LinkedLists are powerful structures that require more space for storage and time to execute some operations but are efficient for inserting and deleting objects in the existing order.
    • Sets are simple structures that don't allow duplicates.
    • HashMaps are commonly used when rapid, direct access to elements is the primary requirement.
    • TreeMaps are preferred when rapid search is the primary requirement.
    • Iterators and Collections operate on whole data structures.
    • ( ) empty parameter methods generally operate on the ends of a structure.
    • (element | object) only parameter methods generally operate on sets or the ends of structures.
    • (index | key) and (index | key, element | object) parameter methods generally operate on specific places within a structure.

Working with Java Lists

As shown in the diagram below, there are three major Java class groups used for working with lists: Lists (such as List, ArrayList, LinkedList, AbstractList, Stack and Vector), Iterators (Iterator and ListIterator) and Collections. Together, these classes and their methods provide a wide spectrum options for list construction, examination and manipulation. For an example of using these classes, look here.

Common Java ArrayList Manipulation Constructs

ArrayList data structures (also called Dynamic Arrays) are a frequently used programming function. ArrayLists have an advantage over arrays - ArrayLists are not fixed in size and can be dynamically changed by adding and deleting elements. Below are some common Java array manipulation constructs. For an overview of working with Java lists, look here.

Creating a new ArrayList

Creating a new ArrayList is coded as shown below using the String type with an initial capacity of 20 entries:

ArrayList mArrayList = new ArrayList<String>(int 20);

Constructing an ArrayList from an Existing List

The example below shows how to construct an ArrayList from the individual words in a String using an intermediary List object. Notice that the sequence of objects created is: String -> String array -> List -> ArrayList

String wordString = "this is a string of words";
String[] wordsArray = wordString.split(" ");
List<String> wordsList = Arrays.asList(wordsArray);
ArrayList wordsArrayList = new ArrayList<String>(wordsList);

add( )

The example below shows how to use the ArrayList add() method to add a String to wordsArrayList::

wordsArrayList.add("Example2");

indexOf( )

The example below shows how to use the indexOf() ArrayList method find the index of an entry:

int index = wordsArrayList.indexOf("Example2");

remove( )

The example below shows how to use the ArrayList remove() method using the entry index or the entry itself:

wordsArrayList.remove(index);
wordsArrayList.remove("Example2");

sort( )

The Collections sort() methods provide a variety of options for sorting ArrayList entries. Note that the ArrayList is sorted in place without returning a new ArrayList. This can be done because the array parameter is a value pointing to the ArrayList of objects. For example, the following statement would sort the words ArrayList from the example above using a modified MergeSort algorithm in an average O(n log(n)) time performance:

Collections.sort(wordsArrayList);

binarySearch( )

The Collections binarySearch() methods provide an efficient way to find an entry in a sorted ArrayList. For example, the following statement would find the index of the wordsArrayList entry for the String Example1 in an average of O(log(n)) time performance:

int entryIndex = Collections.binarySearch(wordsArrayList, "Example1");

iterator( )

The Iterator Class and ArrayList interator() method can be used to iterate over the entries of an ArrayList as shown in the example below :

Iterator mIterator = wordsArrayList.iterator();
while(mIterator.hasNext()){
System.out.println(mIterator.next());
};

Common Java Array Manipulation Constructs

Working with array data structures is one of the most frequently used programming functions. Arrays are implemented in a number of Java classes, such as String, Integer, and Boolean. Those arrays can be manipulated using the Java Arrays class (note that the Arrays class name is a plural). Below are some common Java array manipulation constructs.

split( )

The String split() method separates string characters into array entries using a supplied regular expression. For example, the following statement would create entries in the String array wordsArray using a blank character as a regular expression separator for processing the String wordString:

String wordString = "this is a string of words";
String[] wordsArray = wordString.split(" ");

sort( )

The Arrays sort() methods provide a variety of options for sorting array entries. Note that the array is sorted in place without returning a new array. This can be done because the array parameter is a value pointing to the array of objects. For example, the following statement would sort the words array from the split() example above using a Quicksort process in an average O(n log(n)) time performance:

Arrays.sort(wordsArray);

binarySearch( )

The Arrays binarySearch() methods provide an efficient way to find an entry in a sorted array. For example, the following statement would find the index of the array wordsArray entry for the String entryValue in an average of O(log(n)) time performance:

int entryIndex = Arrays.binarySearch(wordsArray, entryValue);

for( )

The for() construct can be used to iterate over the entries in an array. For example, the following statement would iterate over and print every entry in wordsArray:

for (String word : wordsArray) {System.out.print(word);}

If you want to know and use the index of each entry, then this construct can be used:

for (int i=0; i<wordsArray.length; i++) {System.out.print(wordsArray(i);}

Java vs. Ruby Programming Language Comparison

Java and Ruby are two popular programming languages that represent two very different approaches to structure and syntax. Java has a more formal, strict syntax. Ruby syntax is more forgiving and provides many shortcuts and abbreviations. Java is compiled and Ruby is interpreted.

The table below shows a sampling of Java and Ruby programming language elements that highlight these differences.

JavaRuby
Block {
   . . .
}
def/class/do/if...
   . . .
end
Class
Definition
public class Hello {
   . . .
}
class Hello
   . . .
end
Class
Constructor
public Hello(Strng msg) {
    defaultMsg=msg;
}
def initialize(msg)
   . . .
end
Class
Instantiation
Hello mHello=new Hello();
mHello=Hello.new("Hello Ruby")
Class
Variable
static String myVariable; @@myVariable
Comment // Comment text. # Comment text.
Constant final static MY_CONSTANT; MY_CONSTANT
Conditional if (a>b) {
   . . .
}  else {
   . . .
}
if a>b
   . . .
else
   . . .
end
Exception
Handling
try { . . . }
catch { . . . }
raise . . .
 
Global
Variable
public class MyGlobals {
   public static String mG1
}
$mG1
 
 
Importing import java.sql.*; require 'extensions'
Inheritance extends MyClass < MyClass
Instance
Variable
// Declared outside methods
String mVariable;
@mVariable
 
Iteration for (i=0; i<5; i++) {
    . . .
}

for (myElement : myCollection){
    . . .
}
5.times do | i |
   . . .
end
Local
Variable
String myVariable; my_variable
Method
Definition
public void myIdea(String param) {
   . . .
}
def myIdea(param="Starter")
   . . .
end
Method
Use
myIdea(mParameter) myIdea m_parameter
Namespace
Reference
packageName;
: :
Null Value null
nill
Statement
Separator
;
new line
Variable String mString="Hello Java";
mString="Hello Ruby"

Java abstract Classes and Methods

The abstract modifier for classes and methods has to do with instantiation. Here's a summary:

abstract class

  • can't be instantiated
  • can be extended
  • must be declared abstract if it contains one or more abstract methods

abstract method

  • has only a declaration with no body, such as:
public abstract myMethod(String myString);

Java final, abstract and static Modifiers

The meaning of the final, abstract and static Java modifiers can be confusing and hard to remember. Here's a brief summary:

final

Generally means "can't be changed".

  • final class: can't be extended (sub-classed)
  • final method: can't be overridden or hidden by sub-classes
  • final variable: can only be initialized once 

abstract

Generally means "can't be instantiated".

  • abstract class: can't be instantiated, can be extended, must be declared abstract if it contains one or more abstract methods
  • abstract method: has only a declaration with no body
  • abstract variable: not allowed

static

Generally means "not associated with an instance".

  • static class: only nested inner classes can be static, they are not associated with any instance of the enclosing class
  • static method: doesn't use instance variables
  • static variable: is the same across all instances of it's containing class

additional notes

  • constants: use static final as variable modifiers
  • abstract static methods are not allowed

Sorting Algorithms

Before delving into the large variety of sorting algorithms, it's important to understand that there are simple ways to perform sorts in most programming languages. For example, in Java, the Collections class contains a sort method that can be implemented as simply as:

Collections.sort(list);

where list is an object such as an ArrayList. Some aspects of the Java sort() method to note are:

  • You can also use the Java Comparator class methods to implement your own list item comparison functions for specialized sorting order needs.
  • The Java Collections class (plural) is different than the Collection class (singular).
  • Which sorting algorithm (see below) is used in the Collections.sort() implementation depends on the implementation approach chosen by the Java language developers. You can find implementation notes for Java Collections.sort() here. It currently uses a modified merge sort that performs in the range of Big O O(n log(n)).

If you don't want to use a built-in sort function and you're going to implement your own sort function, there's a large list of sorting algorithms to choose from.  Factors to consider in choosing a sort algorithm include:

  • Speed of the algorithm for the best, average and worst sort times. Most algorithms have sort times characterized by Big O functions of O(n), O(n log(n)) or O(n^2).
  • The relative importance of the best, average and worst sort times for the sort application.
  • Memory required to perform the sort.
  • Type of data to be sorted (e.g., numbers, strings, documents).
  • The size of the data set to be sorted.
  • The need for sort stability (preserving the original order of duplicate items).
  • Distribution and uniformity of values to be sorted.
  • Complexity of the sort algorithm.

For a comparison of sorting algorithms based on these and other values see: https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms.

Here's a quick reference for the major algorithms:

  • Exchange Sorts: based on swapping items
    • Bubble sort: for each pair of indices, swap the items if out of order, loop for items and list.
    • Cocktail sort: variation of bubble sort, passes alternately from top to bottom and bottom to top.
    • Comb sort: variation of bubble sort, selective swap of values
    • Gnome sort: also called the stupd sort, moves values back to just above a value less than it
    • Odd-even sort: developed originally for use with parallel processors, examines odd-even pairs and orders them, alternates pairs until list is ordered
    • Quicksort: divide list into two, with all items on the first list coming before all items on the second list.; then sort the two lists. Repeat. Often the method of choice. One of the fastest sort algorithms
  • Hybrid Sorts: mixture of sort techniques
    • Flashsort: Used on data sets with a known distribution, estimates used for where an element should be placed
    • Introsort: begin with quicksort and switch to heapsort when the recursion depth exceeds a certain level
    • Timsort: adaptative algorithm derived from merge sort and insertion sort. 
  • Insertion sorts: builds the final sorted array one item at a time
    • Insertion sort: determine where the current item belongs in the list of sorted ones, and insert it there
    • Library sort: like library shelves, space is created for new entries within groups such as first letters, space is removed at end of sort
    • Patience sorting: based on the solitaire card game, uses piles of "cards"
    • Shell sort: an attempt to improve insertion sort
    • Tree sort (binary tree sort): build binary tree, then traverse it to create sorted list
    • Cycle sort: in-place with theoretically optimal number of writes
  • Merge sortstakes advantage of the ease of merging already sorted lists into a new sorted list
    • Merge sort: sort the first and second half of the list separately, then merge the sorted lists
    • Strand sort: repeatedly pulls sorted sublists out of the list to be sorted and merges them with the result array
  • Non-comparison sorts
    • Bead sort: can only be used on positive integers, performed using mechanics like beads on an abacus
    • Bucket sort: works by partitioning an array into a number of buckets, each bucket is then sorted individually using the best technique, the buckets are then merged
    • Burstsort: used for sorting strings, employs growable arrays
    • Counting sort: sorts a collection of objects according to keys that are small integers
    • Pigeonhole sort: suitable for sorting lists of elements where the number of elements and number of possible key values are approximately the same, uses auxiliary arrays for grouping values
    • Postman sort: variant of Bucket sort which takes advantage of hierarchical structure
    • Radix sort: sorts strings letter by letter
  • Selection sorts: in place comparison sorts
    • Heapsort: convert the list into a heap, keep removing the largest element from the heap and adding it to the end of the list
    • Selection sort: pick the smallest of the remaining elements, add it to the end of the sorted list
    • Smoothsort
  • Other
  • Unknown class