Pitfalls of Processing a Stream from an External Program
How do you design a standalone program that produces a large amount of binary data, and what are the pitfalls of this approach?
A good example is a file converter (images, MP3s, documents, etc.).
Standalone Producer Application
There are many ways to create a standalone application; one of the easiest and most straightforward is Spring Boot (pom.xml):
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <!-- Your own application should inherit from spring-boot-starter-parent -->
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.7.RELEASE</version>
    </parent>

    <groupId>cz.net21.ttulka.eval</groupId>
    <artifactId>StandaloneBytesProducer</artifactId>
    <version>1.0.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
        </dependency>
        <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.2</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.16.18</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
StandaloneBytesProducerApplication.java:
package cz.net21.ttulka.eval.bytesproducer;

import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import lombok.extern.apachecommons.CommonsLog;

@SpringBootApplication
@CommonsLog
public class StandaloneBytesProducerApplication implements ApplicationRunner {

    @Override
    public void run(ApplicationArguments args) {
        log.info("StandaloneBytesProducerApplication started.");
        try {
            // number of bytes to produce; can be overridden with --bytes=N
            int bytesAmount = 1000;
            if (args.containsOption("bytes")) {
                bytesAmount = Integer.parseInt(args.getOptionValues("bytes").get(0));
            }
            for (int i = 0; i < bytesAmount; i++) {
                System.out.write(i % Byte.MAX_VALUE); // we're writing to the standard output stream
            }
            // write(int) only auto-flushes on a newline byte, so flush explicitly before exiting
            System.out.flush();
        } catch (Exception e) {
            log.error("Unexpected error.", e);
            System.exit(1);
        }
        System.exit(0);
    }

    public static void main(String[] args) throws Exception {
        SpringApplication.run(StandaloneBytesProducerApplication.class, args);
    }
}
Compile it and run:
mvn clean package
mvn spring-boot:run
It looks good. Of course, a consumer will run it directly from the JAR:
java -jar target\StandaloneBytesProducer-1.0.0-SNAPSHOT.jar
The result is what we expected: the Spring Boot ASCII logo, some log messages and our byte stream.
And this is exactly the first pitfall: all this junk corrupts our result, because the byte stream is the only thing we want on the standard output.
Spring Boot uses its own logging (based on commons-logging), which comes in with the artifact spring-boot-starter-logging. To get rid of it, we can exclude this artifact from the build:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-logging</artifactId>
        </exclusion>
    </exclusions>
</dependency>
When we run the program now, the log messages look different. With the Spring Boot logging excluded, commons-logging uses its default fallback implementation, SimpleLog.
SimpleLog sends all messages, for all defined loggers, to stderr. We can prove it by redirecting the standard output into a file:
java -jar target\StandaloneBytesProducer-1.0.0-SNAPSHOT.jar > out.dat
Indeed, the log messages are still written to the console, and the file contains only the Spring Boot logo and our bytes.
Getting rid of the logo is easy: just put an application.yml into the resources directory:
spring:
  main:
    banner-mode: "off"
Now the standard output contains only the result bytes. It's time to implement a consumer...
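Independent of Spring Boot, the rule behind these steps is that the payload goes to stdout and everything else goes to stderr. The sketch below illustrates this in plain Java; the class name CleanProducer and its produce method are hypothetical and not part of the original project. It also buffers the output and flushes it explicitly, since bytes still sitting in a buffer when the program exits may never reach the consumer.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of the original project.
public class CleanProducer {

    // Writes `count` payload bytes to `data` and one diagnostic line to `diagnostics`.
    public static void produce(OutputStream data, PrintStream diagnostics, int count) {
        try {
            OutputStream out = new BufferedOutputStream(data);
            for (int i = 0; i < count; i++) {
                out.write(i % Byte.MAX_VALUE); // same byte pattern as the producer above
            }
            out.flush(); // make sure nothing is left in the buffer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // diagnostics never touch the payload stream
        diagnostics.println("Produced " + count + " bytes.");
    }

    public static void main(String[] args) {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        // payload on stdout, diagnostics on stderr
        produce(System.out, System.err, count);
    }
}
```

Redirecting stdout of such a program into a file should leave only the payload in the file, while the diagnostic line still shows up on the console.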
Standalone Consumer Application
The consumer can be built in the same manner; this time we don't care much about logging (pom.xml):
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <!-- Your own application should inherit from spring-boot-starter-parent -->
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.7.RELEASE</version>
    </parent>

    <groupId>cz.net21.ttulka.eval</groupId>
    <artifactId>StandaloneBytesConsumer</artifactId>
    <version>1.0.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
        </dependency>
        <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.2</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.16.18</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
There is an important pitfall to be aware of here: all streams of the child process (stdout and stderr) must be consumed. If you forget to consume the stderr stream, the producer blocks as soon as the pipe buffer fills up, and the consumer then waits for it forever.
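If the consumer has no use for the child's diagnostics at all, one way around the blocked pipe is not to create the stderr pipe in the first place. The sketch below uses ProcessBuilder.redirectError with Redirect.INHERIT, so the child writes its stderr straight to the consumer's own stderr. The class and method names (ProducerLauncher, builderFor) and the fallback file name producer.jar are made up for illustration.

```java
import java.lang.ProcessBuilder.Redirect;

// Hypothetical helper for illustration; not part of the original project.
public class ProducerLauncher {

    // Builds (but does not start) a process whose stderr needs no consuming.
    public static ProcessBuilder builderFor(String pathToJar) {
        ProcessBuilder builder = new ProcessBuilder("java", "-jar", pathToJar);
        // stdout stays a pipe (the default), so the payload can still be read
        // via process.getInputStream(); stderr goes to the parent's stderr
        builder.redirectError(Redirect.INHERIT);
        return builder;
    }

    public static void main(String[] args) {
        // "producer.jar" is only a placeholder default
        ProcessBuilder builder = builderFor(System.getProperty("PATH_TO_JAR", "producer.jar"));
        System.out.println("command: " + builder.command());
        System.out.println("stderr:  " + builder.redirectError());
    }
}
```

With a builder like this, only process.getInputStream() has to be read. The trade-off is that the child's log lines can no longer be mapped to the consumer's own log levels. (Java 9 added Redirect.DISCARD for throwing the output away entirely; the project here targets Java 8.)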
The error log can either be consumed and thrown away, or consumed and written into the consumer's log:
package cz.net21.ttulka.eval.bytesconsumer;

import java.io.IOException;
import java.io.InputStream;
import java.util.Scanner;

import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import lombok.extern.apachecommons.CommonsLog;

@SpringBootApplication
@CommonsLog
public class StandaloneBytesConsumerApplication implements ApplicationRunner {

    @Override
    public void run(ApplicationArguments args) {
        String pathToJar = System.getProperty("PATH_TO_JAR");
        log.info("StandaloneBytesConsumerApplication started: " + pathToJar);

        ProcessBuilder builder = new ProcessBuilder("java", "-jar", pathToJar);
        try {
            Process process = builder.start();
            // drain stderr in a background thread so the producer never blocks on it
            processErrors(process.getErrorStream());
            // read the payload from the producer's stdout on the current thread
            processStream(process.getInputStream());
        } catch (Exception e) {
            log.error("Unexpected error.", e);
            System.exit(1);
        }
        System.exit(0);
    }

    private void processStream(InputStream stream) throws IOException {
        int b;
        while ((b = stream.read()) != -1) {
            // TODO do something with the stream
        }
        stream.close();
    }

    private void processErrors(final InputStream in) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                // a line without a level prefix keeps the level of the previous line
                int logLevel = 3; // 0 - ERROR, 1 - WARN, 2 - INFO, 3 - DEBUG
                Scanner scanner = new Scanner(in);
                while (scanner.hasNextLine()) {
                    String line = scanner.nextLine();
                    if (line.startsWith("ERROR") || line.startsWith("FATAL")) {
                        logLevel = 0;
                    }
                    if (line.startsWith("WARN")) {
                        logLevel = 1;
                    }
                    if (line.startsWith("INFO")) {
                        logLevel = 2;
                    }
                    if (line.startsWith("DEBUG") || line.startsWith("TRACE")) {
                        logLevel = 3;
                    }
                    switch (logLevel) {
                        case 0:
                            log.error(line);
                            break;
                        case 1:
                            log.warn(line);
                            break;
                        case 2:
                            log.info(line);
                            break;
                        default:
                            log.debug(line);
                            break;
                    }
                }
            }
        }).start();
    }

    public static void main(String[] args) throws Exception {
        SpringApplication.run(StandaloneBytesConsumerApplication.class, args);
    }
}
Compile and run it:
mvn clean package
mvn spring-boot:run -DPATH_TO_JAR=..\StandaloneBytesProducer\target\StandaloneBytesProducer-1.0.0-SNAPSHOT.jar
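The processStream loop in the consumer reads one byte at a time, which is enough for a demo. For a really large output it is usually worth reading in chunks. A minimal sketch, assuming the consumer only needs to forward the bytes somewhere; StreamPump and pump are hypothetical names, and the 8 KiB buffer size is an arbitrary choice:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of the original project.
public class StreamPump {

    // Copies everything from `in` to `out` in chunks; returns the number of bytes copied.
    public static long pump(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192]; // arbitrary chunk size
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // an in-memory stream stands in for process.getInputStream()
        InputStream fake = new ByteArrayInputStream(new byte[20000]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(pump(fake, sink) + " bytes copied");
    }
}
```

In the consumer this could be called as pump(process.getInputStream(), target), followed by process.waitFor() to obtain the exit code; the producer above exits with 1 on failure and 0 otherwise.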
Source code: StandaloneBytesProducer and StandaloneBytesConsumer.
Happy byting!