1 Intro

Abstract data types (ADTs) are an idealized form of a data structure. They define the operations one can perform on the data structure without diving into its implementation. Every abstract data type has a corresponding concrete data type, also known as a particular implementation of the ADT. Within this paper, I’ve included information about a few different concrete implementations of the ADTs we’ve covered within EOU’s CS240 class. If you’d like a more in-depth look at this, check out Steven Skiena’s Algorithm Design Manual.

Concrete implementations of an abstract data type are simply called “data types”. An example might be a singly linked list as a “data type” for the “list” abstract data type. Data types differ from data structures in that data structures are aggregations of data types. We may have data structures which use both lists and stacks in their internal representation.

2 Stacks

A stack is a data structure that holds a collection of data. Items can be put on top of the stack and retrieved from the top of the stack. It’s a “Last In, First Out (LIFO)” data structure. Conceptually, it’s like a stack of plates in your cupboard. When you put plates away, you add them onto the top. When you take out your next plate, you take out the top-most plate.

2.1 Stack API

A stack has four key methods; a short usage sketch follows the list.

  1. push(element) puts an element onto the top of the stack.
  2. pop() removes the top-most element from the stack and returns it.
  3. top() is the same as pop(), but doesn’t actually remove the item from the stack.
  4. isEmpty() reports whether the stack is empty.
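
To make the LIFO behavior concrete, here is a minimal usage sketch in Python. It assumes a stack class exposing the four methods above; the ArrayStack sketched in section 2.2.1 would fit, but the class name is only illustrative.

    s = ArrayStack()          # any class with the four methods above would do
    s.push("a")
    s.push("b")
    s.top()        # -> "b"  (peek at the top without removing it)
    s.pop()        # -> "b"  (last in, first out)
    s.pop()        # -> "a"
    s.isEmpty()    # -> True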

2.2 Implementations

2.2.1 Array Stack

An array stack keeps an internal array along with a pointer marking where the next item of the stack will go. Adding and removing items is simply a matter of moving that pointer around, which gives O(1) time complexity. If we add items beyond the capacity of the internal array, we have to grow the array. This also means there can be wasted space within the array, allocated but waiting to be used.
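
As a rough sketch of that layout (an illustrative guess, not the exact CS240 implementation), here is an array-backed stack in Python. The “array” is a pre-sized Python list that doubles in size when it fills.

    class ArrayStack:
        def __init__(self, capacity=8):
            self._items = [None] * capacity    # internal array
            self._top = 0                      # index where the next push will land

        def push(self, element):
            if self._top == len(self._items):               # array is full: grow it
                self._items.extend([None] * len(self._items))
            self._items[self._top] = element
            self._top += 1                                  # O(1): just move the pointer

        def pop(self):
            if self.isEmpty():
                raise IndexError("pop from empty stack")
            self._top -= 1                                  # O(1): move the pointer back
            element = self._items[self._top]
            self._items[self._top] = None                   # drop the stale reference
            return element

        def top(self):
            if self.isEmpty():
                raise IndexError("top of empty stack")
            return self._items[self._top - 1]

        def isEmpty(self):
            return self._top == 0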

This would be a desirable implementation if the number of items being held is relatively consistent, so we don’t build up a very large internal array that is then rarely used.

A poor use for this, by extension, would be a workload that is very spiky in the beginning (e.g. pushing 1k items onto the stack) but, after those are cleared, only uses 1-2 entries on the stack. This would mean that there are 900+ slots in the internal array which aren’t being used.

2.2.2 Linked Stack

A linked stack uses a doubly linked list to build a stack. It’s more space efficient than the ArrayStack in that it takes only as much space as its nodes require and no more. Its operations are also O(1), though calculating the size is O(n), as you have to traverse the linked structure versus the ArrayStack’s “what’s the offset of my pointer?” check.
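
Here is a small node-based sketch of the idea. For brevity the nodes below are singly linked (each node only points at the node beneath it); a doubly linked version would also carry a prev pointer, but the push and pop costs are the same.

    class _Node:
        def __init__(self, value, next=None):
            self.value = value    # the stored element
            self.next = next      # the node beneath this one

    class LinkedStack:
        def __init__(self):
            self._head = None     # the top of the stack

        def push(self, element):
            self._head = _Node(element, self._head)   # O(1): new node points at old top

        def pop(self):
            if self.isEmpty():
                raise IndexError("pop from empty stack")
            node = self._head
            self._head = node.next                    # O(1): unhook the old top
            return node.value

        def top(self):
            if self.isEmpty():
                raise IndexError("top of empty stack")
            return self._head.value

        def isEmpty(self):
            return self._head is None

        def size(self):
            # O(n): we have to walk the whole chain and count the nodes.
            count, node = 0, self._head
            while node is not None:
                count += 1
                node = node.next
            return count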

This is a great fit for the spiky workload mentioned above. If we push 1k items onto the stack, then step down into normal usage of 1-2 items at a time, we don’t keep holding space that we aren’t using.

3 Queues

A queue is an ordered collection where things are added to the end and pulled from the front. It’s most similar to a line at the bank. People join the line at the back, but the next customer comes from the front of the line. This is a “First In, First Out (FIFO)” data structure.

3.1 Queue API

A queue has four important methods (a short usage sketch follows the list):

  1. enter(element) puts something on the end of the queue.
  2. leave() removes the front-most element from the queue and returns it.
  3. front() is the same as leave(), but doesn’t remove the item from the queue.
  4. isEmpty() says whether the queue is empty.
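
To make the FIFO behavior concrete, here is a minimal usage sketch in Python. It assumes a queue class with the four methods above; the CircularArrayQueue sketched in section 3.2.1 would fit, but the class name is only illustrative.

    q = CircularArrayQueue()
    q.enter("first customer")
    q.enter("second customer")
    q.front()      # -> "first customer"   (peek without removing)
    q.leave()      # -> "first customer"   (first in, first out)
    q.leave()      # -> "second customer"
    q.isEmpty()    # -> True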

3.2 Implementations

3.2.1 Circular Array Queue

A circular array queue uses an array to store the entries of the queue. It moves things off of the queue by moving pointers around. To ensure that we don’t have to resize as often, if we hit the end of the internal array, we wrap around and start writing on the front of it. If at any point we would overwrite data, we resize the array instead and copy the relevant data to a new array.

This potential resizing means that sometimes the enter(element) method may require O(n) time as it iterates through the internal array and copies it into a new array.
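
Here is a rough Python sketch of that behavior (an illustrative guess at the structure described above, not the exact CS240 implementation). The modular arithmetic in enter() and leave() produces the wraparound; _resize() is the O(n) copy.

    class CircularArrayQueue:
        def __init__(self, capacity=8):
            self._items = [None] * capacity
            self._front = 0      # index of the front element
            self._count = 0      # how many elements are currently stored

        def enter(self, element):
            if self._count == len(self._items):       # would overwrite data: grow first
                self._resize(2 * len(self._items))
            back = (self._front + self._count) % len(self._items)   # wrap around the end
            self._items[back] = element
            self._count += 1

        def leave(self):
            if self.isEmpty():
                raise IndexError("leave from empty queue")
            element = self._items[self._front]
            self._items[self._front] = None
            self._front = (self._front + 1) % len(self._items)      # wrap around the end
            self._count -= 1
            return element

        def front(self):
            if self.isEmpty():
                raise IndexError("front of empty queue")
            return self._items[self._front]

        def isEmpty(self):
            return self._count == 0

        def _resize(self, capacity):
            # O(n): copy the live elements into a new array, un-wrapped to start at index 0.
            new_items = [None] * capacity
            for i in range(self._count):
                new_items[i] = self._items[(self._front + i) % len(self._items)]
            self._items = new_items
            self._front = 0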

Because taking from the queue and adding to a non-full queue is done in constant time, this seems like a solid implementation if the size of our queue stays relatively consistent. Otherwise, we may create a large internal array which is poorly utilized.

3.2.2 Array Queue

An array queue is similar to a circular array queue, but it doesn’t wrap around to the beginning to begin writing data when the array reaches the end. This means we’re left with either regularly reshuffling the array to compact it back down to the start (an O(n) operation), or we have to consistently expand the size of the array as data sizes grow (a space-inefficient O(n) operation).
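
A short sketch of the compacting approach, under the same caveat that this is illustrative rather than the exact CS240 implementation (a plain Python list stands in for the growable array, and the compaction threshold is an arbitrary choice):

    class ArrayQueue:
        def __init__(self):
            self._items = []     # backing array; grows as items are appended
            self._front = 0      # index of the front element; only moves forward

        def enter(self, element):
            self._items.append(element)        # always add at the end

        def leave(self):
            if self.isEmpty():
                raise IndexError("leave from empty queue")
            element = self._items[self._front]
            self._front += 1
            # Once the dead space at the front is large, compact back to the start: O(n).
            if self._front > 32 and self._front * 2 > len(self._items):
                self._items = self._items[self._front:]
                self._front = 0
            return element

        def front(self):
            if self.isEmpty():
                raise IndexError("front of empty queue")
            return self._items[self._front]

        def isEmpty(self):
            return self._front >= len(self._items)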

This data structure looks strictly less efficient than the circular array variant. The one benefit it has is that there is no modular arithmetic, which may make it easier to implement.

4 Lists

Lists are much like a to-do list someone might keep for things that need to be done around the house. Sometimes they’re made with an order in mind (e.g. priority), other times they aren’t. Sometimes they are numbered, other times not.

More formally, a list is a data type which allows access to a series of elements. The ordering of a list is stable, which is to say if you look through the list twice, it has the same order. Unlike stacks and queues, there are no restrictions about where you can read/alter elements in a list.

Lists are broken down into four categories or “flavors”: ordered vs unordered and indexed vs unindexed. Unindexed isn’t particularly interesting, so we won’t talk more about that specifically.

4.1 List API

There are several key methods on a list (a short usage sketch follows the list):

  1. add(element) adds an element to the end of the list.
  2. add(index, element) adds an element at the specified index.
  3. get(index) returns the element at the specified index.
  4. remove(index) removes an item by its index.
  5. remove(element) removes an element by its identity.
  6. set(index, element) overwrites the element at an index with a new element.
  7. size() returns how many things are in the list.
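
As a minimal usage sketch, here is how those methods might look in Python, assuming a list class with this API (the ArrayList sketched in section 4.2.1 roughly follows this shape; the names are only illustrative).

    todo = ArrayList()
    todo.add("buy groceries")            # append to the end
    todo.add(0, "pay rent")              # insert at index 0
    todo.get(1)                          # -> "buy groceries"
    todo.set(1, "buy more groceries")    # overwrite index 1
    todo.remove(0)                       # remove "pay rent" by its index
    todo.size()                          # -> 1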

4.2 Implementations

4.2.1 ArrayList

An ArrayList satisfies the list API, but does so backed by an array. An array is logically a big block of contiguous memory, which is useful for performance. Memory locality aids caching, so sequential access should be faster. Because we can index directly into the chunk of memory, get(index) happens in constant time (i.e. O(1)).

If your expected use-case mostly appends items rather than inserting into the middle of the array, this would be a good choice of implementation. Otherwise, when you insert into the middle of the array, you’ll need to move each element after it over by one index. The same thing happens when removing elements. These are O(n) operations, because you have to move each of the elements after the insertion point (approximately n of them).
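
Here is a rough sketch of an array-backed list in Python that makes the index math and the shifting visible (a plain Python list already does this internally; the class and method names are only illustrative).

    class ArrayList:
        def __init__(self, capacity=8):
            self._items = [None] * capacity
            self._size = 0

        def size(self):
            return self._size

        def get(self, index):
            self._check(index)
            return self._items[index]            # direct offset into the array: O(1)

        def set(self, index, element):
            self._check(index)
            self._items[index] = element

        def add(self, *args):
            # add(element) appends to the end; add(index, element) inserts at index.
            if len(args) == 1:
                index, element = self._size, args[0]
            else:
                index, element = args
            if not 0 <= index <= self._size:
                raise IndexError(index)
            if self._size == len(self._items):       # full: grow the backing array
                self._items.extend([None] * len(self._items))
            for i in range(self._size, index, -1):   # shift the tail right: O(n)
                self._items[i] = self._items[i - 1]
            self._items[index] = element
            self._size += 1

        def remove(self, index):
            # remove(index); removing by identity would first search for the element.
            self._check(index)
            element = self._items[index]
            for i in range(index, self._size - 1):   # shift the tail left: O(n)
                self._items[i] = self._items[i + 1]
            self._size -= 1
            self._items[self._size] = None
            return element

        def _check(self, index):
            if not 0 <= index < self._size:
                raise IndexError(index)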

There are some downsides, especially in languages like C, wherein you can have overruns of your array if you’re calculating offsets incorrectly.

4.2.2 LinkedList

A linked list has a much simpler implementation around inserting items at arbitrary points. It doesn’t need to copy all of the values after the changed index. Instead, it just shuffles pointers around so things are in the correct order. This especially pays off as the size of the items increases.

From an implementation perspective, a linked list is a chain of objects where each has a pointer to the next item in the list, until you reach the last one, whose next pointer is null. This means that index-based access is slow. Adding an element to the end is an O(n) operation. Adding something to the front of the list, however, can be done in constant time.
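
Here is a small sketch of a singly linked list in Python showing where those costs come from; only a few of the list methods are included, and the names are only illustrative.

    class _ListNode:
        def __init__(self, value, next=None):
            self.value = value
            self.next = next       # None marks the end of the chain

    class LinkedList:
        def __init__(self):
            self._head = None

        def add_front(self, element):
            # O(1): just re-point the head at a new node.
            self._head = _ListNode(element, self._head)

        def add(self, element):
            # O(n): walk to the last node, then hang the new node off of it.
            node = _ListNode(element)
            if self._head is None:
                self._head = node
                return
            current = self._head
            while current.next is not None:
                current = current.next
            current.next = node

        def get(self, index):
            # O(n): index-based access has to walk the chain from the head.
            if index < 0:
                raise IndexError(index)
            current = self._head
            for _ in range(index):
                if current is None:
                    raise IndexError(index)
                current = current.next
            if current is None:
                raise IndexError(index)
            return current.value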

4.3 Ordered lists

An ordered list is a list that maintains a stable sort as new items are added into it. Where an item is placed in the list depends on the value of the element being inserted. An example of this might be a list of names that are sorted alphabetically.

An ordered list is preferable if we insert less frequently than we need to read out sorted data. If we didn’t maintain the list in sorted order, each time we needed to look through it, we’d have to sort it. Additionally, there are algorithms that can take advantage of the sorted order to accomplish things more quickly, such as searching. It’s much easier to find a name in a sorted list than in an unsorted one.

Insertion happens by comparing the new item against items in the list until we find the right spot, then placing it there. Which list implementation you choose carries different tradeoffs. In an array structure, you can quickly do a binary search through the elements to find where to add the item. Unfortunately, when you insert it, you’ll need to copy all of the later values over by one index, so the whole operation is O(n). In a linked list implementation, we have to iterate sequentially through the list to find the spot, but the insertion itself is constant time, so the total is also O(n) (Miller & Ranum, section 4.23.1).
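
As a small sketch of the array-backed case, Python’s bisect module does the binary search and list.insert does the O(n) shift (the function name here is just for illustration):

    import bisect

    def ordered_insert(sorted_items, element):
        # bisect_left finds the insertion point in O(log n); list.insert still
        # shifts the tail over by one slot, so the whole operation is O(n).
        index = bisect.bisect_left(sorted_items, element)
        sorted_items.insert(index, element)

    names = ["Ada", "Grace", "Linus"]
    ordered_insert(names, "Edsger")
    print(names)    # ['Ada', 'Edsger', 'Grace', 'Linus']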

4.4 Unordered lists

An unordered list maintains an order based on where items were added into the list, regardless of the values themselves. An example of this might be a brainstorming session where we’re just coming up with a list of ideas: there’s no implied ordering to them, just a simple list. Unlike an ordered list, the user controls where elements are inserted or deleted. This works well with the ArrayList implementation, because the average case likely won’t have us inserting things into the middle of the array.

4.5 Indexed lists

Elements in an indexed list have a numeric position. There is no inherent relationship among the elements, as there is in an ordered list. If the contents of the list change, the indexes of the elements may also change.

This is great to implement with ArrayLists, because they offer fast access to specific indexes using pointer math.

5 Bibliography

  • The Algorithm Design Manual, Steven Skiena, 2009
  • Problem Solving with Algorithms and Data Structures using Python, Brad Miller and David Ranum, 2011