Suppose that we have two data files with numbers. The numbers are stored in ascending order. Create a program that merges the two files, i.e. generates a third file where all the numbers from both of the files are stored in ascending order.
The algorithm is relatively simple:
The only complication to above algorithm comes when the one of the files is exhausted before the other. In this case we should take the next number from that file.
Further analysis of the algorithm identifies three major responsibilities:
If we encapsulate the number input into a NumberReader class, the class must provide the following services to the client:
As one might notice the NumberReader class is similar to the Iterator pattern, but because we might need to query one and the same value many times until it is actually used, the class doesn't fetch automatically the next value.
The constructor of the NumberReader class will take one argument - the file name. It will open the file and intialize the instance by fetching the first number.
The responsibility to compare numbers and build output might be encapsulated in a separate class MergeSort. It might be a little overhead for the problem space, but will add possibility for reuse. During the intialization, the MergeSort should receive number input and result output services.
The third service can be implemented directly by using one of the Java's IO classes. We will use Adaptor pattern and will encapsulate this functionality into a NumberWriter class. This class should provide only one service: writeValue().
Here is the code for the NumberReader class. We added a requirement the class to be able to read numbers separated by whitespaces, i.e. new lines, tabs and spaces. We also want that the reader automatically closes the file when end of file is reached.
/** * Class that implements a number queue by reading * a file in a sequential manner. * @author Ivan Georgiev */ import java.io.*; import java.util.StringTokenizer; class NumberReader { private BufferedReader reader; private StringTokenizer tokens = null; private double value; private boolean isValid; public NumberReader(String fileName) throws FileNotFoundException, IOException { reader = new BufferedReader(new FileReader(fileName)); isValid = true; fetchNext(); } /** * Checks the queue status. * @return boolean Returns true if the current value is valid or false * if the queue is empty and current value is not valid. */ public boolean hasValue() { return isValid; } /** * Get the current value, retrieved from the queue. * @return double Returns the current value. */ public double getValue() { return value; } /** * Fetch the next value from the queue and make it * available through the getValue() method. * @return boolean Returns true if next value is available * (see hasValue() method.) */ public boolean fetchNext() throws IOException, NumberFormatException { if (isValid) { isValid = false; if (tokens == null || ! tokens.hasMoreTokens()) { String sLine = reader.readLine(); if (sLine == null) { tokens = null; } else { tokens = new StringTokenizer(sLine); } } if (tokens != null) { value = Double.parseDouble(tokens.nextToken()); isValid = true; } if (!isValid) { reader.close(); } } return isValid; } }
The code for the NumberWriter class is relatively simple.
/** * Class to encapsulate number output. * @author Ivan Georgiev */ import java.io.*; class NumberWriter { private PrintWriter printer; public NumberWriter(String fileName) throws IOException { printer = new PrintWriter(new BufferedWriter(new FileWriter(fileName))); } public void writeValue(double value) { printer.println(value); } public void close() { printer.close(); } }
Having implemented the NumberReader and NumberWriter and delegated the IO tasks to them, we can focus on the core of the MergeSort class.
/** * Class to implement merge sort. * @author Ivan Georgiev */ class MergeSort { private NumberReader source1; private NumberReader source2; private NumberWriter writer; public MergeSort(NumberReader source1, NumberReader source2, NumberWriter writer) { this.source1 = source1; this.source2 = source2; this.writer = writer; } public void run() throws IOException, NumberFormatException { NumberReader usedReader; while (source1.hasValue() || source2.hasValue()) { if (!source1.hasValue()) { usedReader = source2; } else if (!source2.hasValue()) { usedReader = source1; } else if (source1.getValue() < source2.getValue()) { usedReader = source1; } else { usedReader = source2; } writer.writeValue(usedReader.getValue()); usedReader.fetchNext(); } writer.close(); } }
Here is a simple program to test the application.
/** * The problem: * Suppose that we have two data files with numbers. * The numbers are stored in ascending order. Create * a program that merges the two files, i.e. generates * a third file where all the numbers from both of * the files are stored in ascending order. * * @author Ivan Georgiev * @version 1.00 2008/9/8 */ import java.io.*; import java.util.StringTokenizer; public class FileMergeApplication { private static final String sourceName1 = "../data/source1.txt"; private static final String sourceName2 = "../data/source2.txt"; private static final String targetName = "../data/output.txt"; public static void main(String[] args) { try { NumberReader reader1 = new NumberReader(sourceName1); NumberReader reader2 = new NumberReader(sourceName2); NumberWriter writer = new NumberWriter(targetName); MergeSort sort = new MergeSort(reader1, reader2, writer); sort.run(); } catch (Exception e) { e.printStackTrace(); } } }
source1.txt
5 15 35 80
source2.txt
10 20 30 40 50 60
output.txt
5.0 10.0 15.0 20.0 30.0 35.0 40.0 50.0 60.0 80.0
The source2.txt has some numbers separated a space and a tab character to test that NumberReader supports the additional requirement for any whitespace separator.