Sequences
In the previous section, we saw how filtering and mapping operations can be chained together:
numbers.filter { it % 2 != 0 }.map { it * it }
In the example above, the initial filtering operation generates a temporary collection, which is then transformed by the subsequent mapping operation. After that point, the temporary collection is no longer required, and the memory it uses will be reclaimed.
This isn’t much of a problem for the small examples that we’ve considered so far, but there could be significant performance implications to the creation of these intermediate collections if we are working with longer chains of operations, or if we are working with very large amounts of data (e.g., lists with millions of elements).
To cater for these scenarios, Kotlin provides the Sequence type [1].
A Sequence object represents a sequence of values, over which we can
iterate. A Sequence object knows how to retrieve the next value in this
sequence of values, without actually having to store all of the values
from the sequence itself. This means that a potentially infinite series of
values can be represented as a Sequence.
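As a minimal sketch of this idea, the standard library function generateSequence() can describe a series with no upper bound, from which we take only the values we need. The helper name firstPowersOfTwo is hypothetical, chosen purely for illustration.

```kotlin
// Hypothetical helper: the first `count` powers of two,
// drawn from a sequence that has no upper bound.
fun firstPowersOfTwo(count: Int): List<Int> =
    generateSequence(1) { it * 2 }   // 1, 2, 4, 8, ... computed on demand
        .take(count)                 // still lazy: nothing has been computed yet
        .toList()                    // only now are `count` values produced

fun main() {
    println(firstPowersOfTwo(5))   // prints [1, 2, 4, 8, 16]
}
```

Only the requested values are ever computed; the rest of the series exists purely as a rule for producing the next value.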
Because sequences are iterable, operations such as filter() and map()
can be performed on sequences just as easily as on actual collections.
In this way, we can avoid creation of intermediate collections when we
chain operations together.
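The difference can be sketched by writing the same chain of operations in both styles. The function names and the word list below are assumptions made for this illustration; both versions return the same result, but only the first builds a temporary list between the two steps.

```kotlin
// Eager: filter() builds a temporary list, which map() then consumes.
fun shoutLongWordsEagerly(words: List<String>): List<String> =
    words.filter { it.length > 3 }
        .map { it.uppercase() }

// Lazy: each word passes through both steps in turn, so no
// intermediate collection is created between filter() and map().
fun shoutLongWordsLazily(words: List<String>): List<String> =
    words.asSequence()
        .filter { it.length > 3 }
        .map { it.uppercase() }
        .toList()

fun main() {
    val words = listOf("map", "filter", "fold", "zip")
    println(shoutLongWordsEagerly(words))   // prints [FILTER, FOLD]
    println(shoutLongWordsLazily(words))    // prints [FILTER, FOLD]
}
```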
The functional programming approach involves using filter, map and a few
other functions as small, simple ‘building blocks’ that can be combined in
order to implement very complex manipulations of data in a collection.
This approach is much more powerful than one in which we have to write a large, complicated function every time we are faced with a data manipulation task. Complex manipulations are easier to understand when expressed as combinations of simple operations. Using sequences ensures that we can benefit from this expressiveness without sacrificing performance.
See the Kotlin language documentation for more information on how to use sequences.
Task 8.4.1
- Examine the file Sequences.kt, in the tasks/task8_4_1 subdirectory of your repository. You should see code like this:

      fun main() {
          val numbers = listOf(1, 4, 7, 2, 9, 3, 8)
          val result = numbers   // make changes here
          println(result)
      }

- Compile the program and run it. You should see the contents of numbers displayed.
- Add .asSequence() to the end of the second line of code in main(), after numbers. Recompile and run the program again. How has the output changed?
- Add .filter { it % 2 != 0 } to the end of that line. Recompile and run the program again. What do you see now?
- Add .map { it * it } to the end of the line. Recompile and run again. What has changed?
- The filtering and mapping operations added in the previous steps did not themselves produce any data. They merely extended the sequence with additional stages of processing. To retrieve usable values from the sequence, you must add a terminal operation to it.

  Add .toList() to the end of the line, then recompile and run again. This time, you should see a list of the squares of the odd integers from numbers.
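If you want to see when the work in a sequence actually happens, one approach is to record each call made by the filtering and mapping stages. The sketch below is not part of the task files; the function traceLazy and its log messages are hypothetical names used only for illustration.

```kotlin
// Records the order in which the filter and map stages run on a sequence.
fun traceLazy(numbers: List<Int>): List<String> {
    val log = mutableListOf<String>()
    val pipeline = numbers.asSequence()
        .filter { log.add("filter $it"); it % 2 != 0 }
        .map { log.add("map $it"); it * it }
    log.add("pipeline built")   // no element has been filtered or mapped yet
    pipeline.toList()           // the terminal operation triggers the work
    return log
}

fun main() {
    traceLazy(listOf(1, 2, 3)).forEach { println(it) }
}
```

Running this prints "pipeline built" first, followed by "filter 1", "map 1", "filter 2", "filter 3" and "map 3". Each element travels through the whole chain before the next one is considered, and nothing at all happens until the terminal operation is called.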
Task 8.4.2
- The tasks/task8_4_2 subdirectory of your repository is an Amper project that explores the performance impact of using sequences. Open Benchmark.kt in the src subdirectory of this project and study the code in this file.

  The operation performed by this program involves reading lines of text from a file, filtering out the blank lines, then filtering out all the lines containing at least 10 characters, then converting the lines that remain into all lowercase.

  This operation is performed in two different ways, with and without the use of Sequence. The execution time of each implementation is measured and displayed.
- Also in tasks/task8_4_2 is a large text file, containing the entire text of Leo Tolstoy’s notoriously long novel War And Peace. The file is over 3 MB in size and contains over 66,000 lines. Take a moment to examine its contents.
- In a terminal window, go to the project subdirectory and then run the performance benchmark on the file, using this command:

      ./amper run war-and-peace.txt
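For orientation, here is a rough sketch of how a comparison of this kind could be written. It is not the contents of Benchmark.kt: the function names are hypothetical, the input is a synthetic in-memory list rather than a file, and "filtering out" is read here as "discarding", so lines of 10 or more characters are dropped. Timing uses measureTimeMillis from the standard library, and a single measurement like this one is only a crude indication of performance.

```kotlin
import kotlin.system.measureTimeMillis

// Eager version: each step builds a complete intermediate list.
fun processEagerly(lines: List<String>): List<String> =
    lines.filter { it.isNotBlank() }
        .filter { it.length < 10 }   // assumption: discard lines of 10+ characters
        .map { it.lowercase() }

// Lazy version: each line flows through all three steps in turn.
fun processLazily(lines: List<String>): List<String> =
    lines.asSequence()
        .filter { it.isNotBlank() }
        .filter { it.length < 10 }
        .map { it.lowercase() }
        .toList()

fun main() {
    // Synthetic stand-in for the lines of a large text file
    val lines = List(200_000) { i -> if (i % 3 == 0) "" else "Line $i" }

    val eagerMs = measureTimeMillis { processEagerly(lines) }
    val lazyMs = measureTimeMillis { processLazily(lines) }
    println("Eager: $eagerMs ms, lazy: $lazyMs ms")
}
```

The measured times will vary from run to run and from machine to machine, so treat any single pair of numbers with caution.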
[1] The equivalent feature in Python is the generator.