Mashed Code Magazine 2012 Published
Starting right now, you can get your electronic copy of the 2012 issue
of Mashed Code Magazine. There are three file formats available for
download: PDF (computer, iPad), EPUB (B&N Nook), and MOBI (Kindle).
All three formats will be downloadable, for free, from www.mashedcodemagazine.com.
Aside from the free download, there are a couple of other ways to get
your copy. If you love and use your Kindle, you can get it straight
from the Kindle store on your device for $0.99 (the lowest price we
can charge). Just search for the term “Mashed Code” in the Kindle
store to find it. We’re experimenting with this option this year, so
please send feedback on it, especially if you find the process is less
than perfect.
December Issue of NFJS, the Magazine published.
Here’s what is in this month’s NFJS, the Magazine
Brian Sletten – BDD and REST
Forward-looking development teams have started to use Behavior-Driven Development (BDD) in the past few years to test their code against clearly expressed acceptance criteria. The adoption of tools like Cucumber, JBehave, RSpec and EasyB shows that this trend is growing. As we see an increase in the use of resource-oriented APIs, there is an opportunity to apply these testing ideas to make sure our services do what they are supposed to, maintain high quality and avoid accidental breakage.
Ken Kousen – Mocks and Stubs in Groovy Tests
True unit testing means isolating the class you’re testing from all of its dependencies. If the dependencies are simple, you can write your own stub classes and maintain them yourself. Java provides several mocking frameworks that allow you to set expectations and verify the interactions between your class and the mocks. If you have Groovy available, however, you have many capabilities built right into the language. This article will show you how to use Groovy to generate both mocks and stubs in an easy, controlled way.
John Griffin – Algorithms for Better Text Search Results
This article discusses four enhancements to searching: utilizing synonyms to broaden results, reducing search terms to their most basic form, fuzzy queries (there’s that word again) to help with misspellings and “did you mean?” suggestions, and finally, phonetic equivalents for search terms. We will be using code written in both Lucene and Hibernate Search format so some experience with these packages will be helpful but not absolutely necessary. Regardless of what your favorite language might be, these examples will give you ideas to use in your own code.
Brian Tarbox – Knowns and Unknowns of Scrum and Agile
One of the premises that Scrum operates from is that most interesting projects have constantly shifting requirements and exist largely in isolation from any previous implementation or design. Anyone mentioning static requirements or things we “know to be true” isn’t “agile”. This leads to missing the advantages gleaned from previous projects; which is odd given Scrum’s penchant for transparency and feedback. Scrum is great at responding to unknowns found during a sprint, but not so good at doing the same with facts from outside a sprint. As an overreaction to the Big Design Upfront paradigm, Scrum tends to view all projects as swimming in a sea of shifting requirements with little or nothing known upfront. It also tends to assert that we can succeed in spite of, or even because of, all of the variables facing a project. This paper asserts that this is a serious limitation of Scrum and shows what this limitation causes us to miss.
Here’s what is in next month’s NFJS, the Magazine
This is the last issue of the year. The magazine will return in March, 2012.
I’m very proud of the work we do on this new magazine. The staff and I have worked hard to produce a top-notch magazine that is unique in the realm of software development magazines. The magazine costs $50 per year, which includes 10 issues. Each issue has at least four articles. You can download in a print-quality PDF and two mobile formats: EPUB (for the Nook and iPad) and MOBI (for the Kindle). The articles are professionally edited and are written by top experts in their field, so the content is worth well more than the $50 you pay.
The June issue just published this morning and you can subscribe here: http://bit.ly/fETp6d. As always, if you have questions just comment on this post and I’ll respond quickly.
The Language of Computer Science: Algorithm
Like many software developers, I think it’s interesting to know at least something of the history of computer science. Programming languages, for one, offer up genuinely intriguing tales when the history of their making is told. But there are programming languages, and then there is the language with which we talk about programming: the jargon of our trade. Computer science jargon has storied origins, just like other aspects of the trade. For those who are interested in the history of the words we use, here is a brief look at one of them: algorithm.
Despite an algorithm’s propensity towards a precise set of steps, the word’s origins are fuzzy and the path to its current form twisted. The noun algorithm is what the Oxford English Dictionary (OED) describes as one of many “pseudo-etymological perversions” that transformed the surname of a 9th century mathematician into a mainstay of modern computer science jargon. An Arabic mathematician named Abu Ja’far Mohammed Ben Musa wrote a text on algebra which was subsequently translated from Arabic to Old French. The surname of the text’s author, al-Khowārazmī, was later rendered as the word algorisme in Old French. Another form, algorism, has since been the basis for all sorts of mangling of a word that originally was just a name for the Arabic/Indian numeral system, better known as the decimal system.
So given that, it makes perfect sense that the true origin of algorithm is actually Greek, right? The Greek language actually does enter our story here, though it’s not the true origin of the word. The Greek word for “number”, arithmos, and algorithm have been “learnedly confused” in the past to be the same word. The correct lineage though is that algorithm is just a form of algorism that has crept up in the late 19th century to mid 20th century.
From Arabic surname, to Old French, to being confused with a Greek word with a similar meaning, you can see that even the history of a word can be interesting. You won’t get better at coding from reading this but you might better appreciate the river of change in which your computer science lexicon exists. If nothing else, now you’ve got some new knowledge to flaunt at your next code review.
October Issue of NFJS, the Magazine published.
Here’s what is in this month’s NFJS, the Magazine
Tim Berglund – Plugging into Gradle Plugins
Gradle is a next-generation build system designed to provide the right balance between conventions and customization. Based on a Groovy DSL, Gradle makes it very easy to add arbitrary logic into a build script, but this approach is hostile to maintainability and will not be compatible with the tooling solutions likely to emerge over the next year. Instead, Gradle provides a powerful plugin architecture to extend the functionality of the tool and the DSL it exposes to the build masters and users of the build. In this article, we’ll look at how to program and package Gradle plugins.
Jeremy Deane – Enterprise Integration Agility
According to Programmable Web in 2010 the rate of growth in public Web APIs doubled. This exponential trend continues in 2011 resulting in an ever more connected web. This connected contagion is not just relegated to the domain of Web 2.0 but has infected the corporate world. In fact, companies are becoming more reliant on Software as a Service (SAAS) to provide key business functions. In this article, we will explore several options for rapidly delivering flexible-integrated solutions.
Johnny Wey – Relax with CouchDB
There was a time, not terribly long ago, when choosing a data persistence technology was a relatively simple task.
Mark Volkmann – Sass…CSS Evolved
Cascading Style Sheets have a simple syntax for specifying the formatting properties to be applied to HTML content. Many would say the syntax is too simple. It is difficult to avoid repeating properties in multiple CSS rules. Sass addresses that issue and more, making it possible to keep formatting descriptions DRY. Being DRY allows a change to one property to affect the formatting of many related elements. This article assumes basic knowledge of CSS.
Here’s what is in next month’s NFJS, the Magazine
Matthew McCullough – Game Theory for Software Developers: Economics & Statistics in the Domain of Programmers
Game Theory is a fascinating focused study of strategic interactive decision making that originated in the 1950s with help from our computer science founding father, John von Neumann. It began with a focus on economics, later expanded to military battle strategy, and now even aims to describe the behavior of business teams scaled all the way up to the size of nations. It offers its students an insightful set of descriptions and simplifications, much like those of design patterns, that clarify behavior that might otherwise be described as “irrational.” Such behavior, it turns out, is rarely irrational, but merely driven by motivations that were previously just misunderstood or overlooked. This article will allow you to peer through this new lens, making the underlying motivations in the world of business, team interactions, product pricing, advertising and service offerings crystal clear. This Game Theory view, enabled by the ideas in this article, will provide professional guidance and employment tactics for engineers looking to extract maximum career and income benefits for their labor in the realm of software development.
Jessica Kerr & Ted Neward – Guava: an excellent source of vitamin C (for Concurrency)
Dive deeper into one of the biggest timesavers in Google’s Guava library: MapMaker. More than a static factory, MapMaker creates an insta-cache. The cache can calculate values on demand, expire entries as they reach a specific age, and avoid taking up too much memory. Best of all, it is completely thread-safe. While we’re at it, we’ll cover some of the other utilities in Guava that make concurrency in Java a little less painful. Guava fruit is supposed to help with high blood pressure, and these tricks just might help with yours.
Matt Stine – Vagrant: Virtualized Development Environments Made Simple
Have you ever wished that your local development sandbox could look exactly like production, but you’ve got a mismatch between your local OS and your production OS? And what about the age old “it works on my machine” excuse that quite often stems from differences between developer sandboxes? Many have turned to virtualization, creating a machine image that can be passed around the team. But who manages the template? How do you keep things in sync? In this article, we’ll explore Vagrant, an open source tool that allows you to easily create and manage virtual development environments that can be provisioned on demand and “thrown away” when no longer needed.
Mark Richards – High Performance Messaging
As Woody Allen once said “It is impossible to travel faster than the speed of light, and certainly not desirable, as one’s hat keeps blowing off”. While messages traveling through your system may never quite reach the speed of light, you could certainly make them travel fast. In this article I will explore four simple techniques that will increase the speed and throughput of your messaging system through relatively minor changes in your messaging infrastructure. So hold on to your hats and enjoy the ride (it’s a fast one).
Mashed Code Magazine Open Submissions
Mashed Code Magazine is a magazine published as a complement to the wildly popular CodeMash conference. The 2012 issue of the magazine will again publish articles written by the community. Submissions will be accepted from September 15, 2011 through October 2, 2011.
We are asking that your submission be a completed article, ready for print. Keep it interesting to grab both our attention and our readers’ attention! We will evaluate submitted articles against the following criteria:
- Between 300 and 500 words in length (similar to last year’s lightning articles).
- Topic is up to the author, but please keep it similar to typical CodeMash topics.
- Please do not use images or tables (text only).
- Editing should be completed by author before submission. Only minor editing will be completed by Mashed Code Magazine.
Articles not meeting these standards will not be considered for publication.
Mashed Code Magazine will make an attempt to print as many articles as possible, but submission does not guarantee inclusion in either the magazine or the website. You will be notified upon selection if your article will be included in the Magazine, our website, or both.
Go to http://www.mashedcodemagazine.com to submit your article.
September Issue of NFJS, the Magazine published.
Here’s what is in this month’s NFJS, the Magazine
Neal Ford – Build Your Own Technology Radar
ThoughtWorks’ Technical Advisory Board creates a “technology radar” three or four times a year. It is a working document that helps the company as a whole make decisions about what technologies are interesting and where we should be spending our time. This is a useful exercise both for you and your company. This article focuses on why you should undertake this exercise, both for your company and your own career development.
Venkat Subramaniam – Programming with Scala Traits – Part Two
In this part two of the series, Venkat Subramaniam will discuss how to apply multiple traits at both class and instance level and to implement the decorator pattern.
Brian Sam-Bodden – MVC Meet JavaScript, JavaScript Meet MVC
For years the software community has been pushing the MVC architectural pattern to organize and separate the concerns of our applications. So far we seem to have done a decent job of accomplishing that based on the enforcement of the pattern in the most successful web frameworks such as Rails, Grails, JSF, Struts and many others. The last frontier for MVC seems to be the sometimes convoluted world of JavaScript; the client tier of our web applications. Although frameworks like jQuery, Prototype, Scriptaculous, ExtJS, DOJO and others have greatly helped in cleaning up and structuring the client tier, there’s still much to be desired. In recent years several micro-frameworks have appeared that aim to put an end to the madness of the JavaScript client tier world. In this article we’ll explore the most prominent players and see how their usage impacts modern web development.
Hamlet D’Arcy – Better DSLs with Groovy Command Expressions
Domain Specific Languages (DSLs) are often littered with the accidental complexity of the host language. Have you seen a supposedly “friendly” language expression like ride(minutes(10)).on(bus).towards(Basel)? The newest version of Groovy contains a language feature that aims to eliminate the noise of all those extra periods and parentheses so that your DSL looks more like “ride 10.minutes on bus towards Basel”. This article shows you, step-by-step, how to use Groovy Command Expressions and plain old metaprogramming to write just this DSL and also offers advice on when, and when not, to use this new language feature.
Here’s what is in next month’s NFJS, the Magazine
Tim Berglund – Gradle Plugins
Gradle is a next-generation build system designed to provide the right balance between conventions and customization. Based on a Groovy DSL, Gradle makes it very easy to add arbitrary logic into a build script, but this approach is hostile to maintainability and will not be compatible with the tooling solutions likely to emerge over the next year. Instead, Gradle provides a powerful plugin architecture to extend the functionality of the tool and the DSL it exposes to the build masters and users of the build. In this article, we’ll look at how to program and package Gradle plugins.
Jeremy Deane – Enterprise Integration Agility
According to Programmable Web in 2010 the rate of growth in public Web APIs doubled. This exponential trend continues in 2011 resulting in an ever more connected web. This connected contagion is not just relegated to the domain of Web 2.0 but has infected the corporate world. In fact, companies are becoming more reliant on Software as a Service (SAAS) to provide key business functions. In this article, we will explore several options for rapidly delivering flexible-integrated solutions.
Johnny Wey – Relax with CouchDB
NoSQL is big these days. There are myriad outstanding options and each has strengths and weaknesses. CouchDB is a document-oriented schema-less database expertly tailored for the web. It is extremely fault-tolerant and error resistant, easy to understand and query and even has a mobile version for iOS and Android development!
In this article, Johnny will introduce CouchDB the database and demonstrate the API, replication and the problems CouchDB was born to solve.
Mark Volkmann – Sass…CSS Evolved
Cascading Style Sheets have a simple syntax for specifying the formatting properties to be applied to HTML content. Many would say the syntax is too simple. It is difficult to avoid repeating properties in multiple CSS rules. Sass addresses that issue and more, making it possible to keep formatting descriptions DRY (http://en.wikipedia.org/wiki/Don't_repeat_yourself). Being DRY allows a change to one property to affect the formatting of many related elements.
August Issue of NFJS, the Magazine published.
Here’s what is in this month’s NFJS, the Magazine
Venkat Subramaniam – Programming with Scala Traits – Part One
In object modeling, mixins provide a way to create abstractions that define a common functionality that can be mixed into different abstractions. As Java programmers we did not quite have the ability to make use of this powerful concept, especially because Java does not provide multiple inheritance. In languages that do provide multiple inheritance, like C++, this is quite difficult to use in practice due to problems with method collision. Scala elegantly supports compile time mixins through traits. In this first part of the series, we’ll learn about traits in Scala and how to use them. In the second part, we’ll learn how to apply multiple traits both at class and instance levels.
Raju Gandhi – On Eloquent Conversations – Part Two
In the first installment of this series, we discussed the need for integration, and some of the potential pitfalls, especially when attempting to roll your own integration system. We then proceeded to discuss some of the patterns in Gregor Hohpe’s and Bobby Woolf’s aptly named “Enterprise Integration Patterns” and their corresponding implementations in Spring Integration. We discussed the core patterns that make up the founding blocks of Spring Integration – “Message Channel”, “Message” and “Message Endpoint”. In this article we will explore a few more patterns that will allow you to route, filter and manipulate messages as well as talk to external systems. We will learn how to do this while leveraging Spring’s declarative model that lets you focus on your domain, and let Spring Integration handle the specifics of messaging.
Craig Walls – NoXML: Spring for XML Haters
In spite of all of the great things Spring brings to Java development, one criticism it has received a lot of over the years is its heavy use of XML for configuration. It’s true that Spring configuration has traditionally required XML. Lots of XML. It seems that XML has fallen out of favor with many developers. And for those who are card-carrying members of the He-Man XML Haters Club, it’s hard to see the benefits of Spring through the haze of XML. If you’re among the XML haters, then this article is for you. Each version of Spring has taken steps to lighten the XML burden and I’m going to show you a few tricks from the latest versions of Spring that make it possible to develop a Spring application with minimal or even no XML whatsoever. To illustrate these techniques, I’ve written a simple Guestbook application using common Spring XML configuration. Throughout this article, we’ll swap XML configuration for Java configuration, until there is no more XML left in the project. If you want to follow along, you can download the before and after projects from this magazine’s download URL.
Scott Leberknight – Handling Big Data with HBase
In the past few years we have seen a veritable explosion in various ways to store and retrieve data. The so-called NoSQL databases have been leading the charge and creating all these new persistence choices. These alternatives have, in large part, become more popular due to the rise of Big Data led by companies such as Google, Amazon, Twitter, and Facebook as they have amassed vast amounts of data that must be stored, queried, and analyzed. But more and more companies are collecting massive amounts of data and they need to be able to effectively use all that data to fuel their business. For example, social networks all need to be able to analyze large social graphs of people and make recommendations for who to link to next, while almost every large website out there now has a recommendation engine that tries to suggest ever more things you might want to purchase. As these businesses collect more data, they need a way to scale up easily without rewriting entire systems.
Here’s what is in next month’s NFJS, the Magazine
Neal Ford – Build Your Own Technology Radar
ThoughtWorks’ Technical Advisory Board creates a “technology radar” three or four times a year. It is a working document that helps the company as a whole make decisions about what technologies are interesting and where we should be spending our time. This is a useful exercise both for you and your company. This article focuses on why you should undertake this exercise, both for your company and your own career development.
Venkat Subramaniam – Scala Traits Part Two
In this part two of the series, Venkat Subramaniam will discuss how to apply multiple traits at both class and instance level and to implement the decorator pattern.
Brian Sam-Bodden – MVC Meet JavaScript, JavaScript Meet MVC
For years the software community has been pushing the MVC architectural pattern to organize and separate the concerns of our applications. So far we seem to have done a decent job of accomplishing that based on the enforcement of the pattern in the most successful web frameworks such as Rails, Grails, JSF, Struts and many others. The last frontier for MVC seems to be the sometimes convoluted world of JavaScript; the client tier of our web applications. Although frameworks like jQuery, Prototype, Scriptaculous, ExtJS, DOJO and others have greatly helped in cleaning up and structuring the client tier, there’s still much to be desired. In recent years several micro-frameworks have appeared that aim to put an end to the madness of the JavaScript client tier world. In this article we’ll explore the most prominent players and see how their usage impacts modern web development.
Hamlet D’Arcy – Better DSLs with Groovy Command Expressions
Domain Specific Languages (DSLs) are often littered with the accidental complexity of the host language. Have you seen a supposedly “friendly” language expression like ride(minutes(10)).on(bus).towards(Basel)? The newest version of Groovy contains a language feature that aims to eliminate the noise of all those extra periods and parentheses so that your DSL looks more like “ride 10.minutes on bus towards Basel”. This article shows you, step-by-step, how to use Groovy Command Expressions and plain old metaprogramming to write just this DSL and also offers advice on when, and when not, to use this new language feature.
Getting Started: Testing Concurrent Java Code
I recently finished the last class of my Master of Science in Computer Science program at Franklin University. I had to write a short paper for that class that I think is worth sharing with you. The paper was written with the class as the audience, so it’s a little simpler and a lot less detailed than it should be. Nonetheless, I think it has some merit to programmers who are trying to get started with testing their concurrent code. Now that I’m done with school, I hope to explore this topic more fully, blogging the entire way. Here are the contents of the paper.
Abstract
Over the life of the Java language, Java developers have evolved their development life cycle into a complex and sophisticated ecosystem of tools, practices and conventions. Rather than relying on gut feelings and verbal agreements that software has been tested just “good enough” to go to production, the modern Java developer relies on metrics about the passing rate of numerous types of tests and code quality before deciding that code is production ready. Unfortunately, this only applies to sequential Java code—not concurrent code. Despite having more complex metrics than sequential code, there is no excuse for excluding multithreaded Java code from this modern development environment. This paper will help you to collect the tools and practices needed to modernize the way you develop concurrent Java code.
Acknowledgements
The author would like to thank Venkat Subramaniam and Daniel Hinojosa for their careful review of this paper and their thoughtful comments.
Introduction
—The problem here is that programmers are not as scared of using threads as they should be.
David Hovermeyer and William Pugh
Creating software that can be run by multiple threads concurrently is a daunting task—dwarfed only by the act of testing that code. Nonetheless, concurrent code can be tested. However, to get to the topic of testing, some groundwork has to be laid out first. There are some contextual considerations that one must cover before being able to usefully test concurrent code. This paper will work through those fundamental concerns and how to address them in an effort to introduce how to get started with testing concurrent Java code.
The Modern Java Development Model
There is no standard way to develop Java programs. However, talk among Java developers, speeches by consultants and conference speakers and the writing of bloggers and professional authors all reverberate with common practices upheld by Java programmers today. From this body of common knowledge you can glean the basic tools and processes used by the modern Java developer.
The modern Java developer does all of their development from an Integrated Development Environment (IDE) such as Eclipse, NetBeans or IntelliJ IDEA. In the IDE, they are able to perform quick and frequent refactorings (Fowler) of their code, which are enabled by unit tests that constantly verify the correctness of the changes. The modern Java developer understands unit testing and, whether following the Test Driven Development (Beck) prescriptions or not, spends much of their coding time writing those tests. When the code is committed to source control, the more automated part of the process begins when some form of Continuous Integration (CI) (Duvall et al.) system picks up the new code. At a bare minimum, the CI system performs a full compile and runs the entire unit test suite in order to continually verify the code. If they exist, more involved automated integration, functional, regression or user acceptance tests can be run to provide further verification.
Aside from compilation and testing, the modern Java developer relies on metrics. During the CI build it is typical to run a battery of tools against the code that gather metrics about its health. Source line counts, various static analysis tools and code coverage tools are all common. The developer understands how to interpret the output of these tools so their findings can be used to improve the code. Improvements are implemented by making modifications that range from simply adding missing unit tests to fixing latent bugs.
What is implicit in this collection of tools and practices is that only sequential Java code is being reviewed. How to work with multithreaded code in this modern Java development environment is never discussed. The remainder of this paper will help you begin to locate, test and verify multithreaded code in a way that fits into your role as a modern Java developer.
Finding Concurrency Bugs
You cannot test code if you are not aware of its existence. So before you can start developing tests for your multithreaded code, you must locate it in your code base. If you are writing the tests as you write that code, you don’t need this section. However, if you want to test existing code, you can use the tips here to help you home in on the code you might want to test for multithreaded correctness.
Locating Multithreaded and Related Code
Part of the reason that working with multithreaded Java code is so difficult is that it is not trivial to figure out what code could be interacted with by multiple threads at runtime. Start thinking about this problem by asking yourself the simple question “do I know where my multithreaded code is in my codebase?”—it’s highly likely that you cannot answer that question accurately. While it is a sensible idea to keep multithreaded code in isolation, you may find that your multithreaded code is sprinkled throughout your code base. There are two simple ways to get a much better idea of where that code is: by performing some simple text searches and by working with your peers.
There are several easy, yet fast and effective tools at your disposal that will seek out multithreaded code. The modern Java developer’s IDE is the most convenient to use. However, some simple command line tools can do just as well. In both cases, the onus is on the developer to determine what code could be subject to failure when being run simultaneously on multiple threads.
There is no tool in your IDE that will list all the source code that might be run by multiple threads, so you have to devise a way to find that code. Listing 1 is a collection of text strings that provides a good starting point for finding threaded code. Searching for the strings in the listing will uncover code that defines a thread, calls methods of the Thread object or uses low-level locking mechanisms in the Java Concurrency API.
"implements Runnable", "extends Thread", "synchronized", ".notify()", ".notifyAll()", ".wait()", ".wait(...)", ".interrupt()", ".interrupted()", ".join()", ".join(...)", ".sleep(...)", ".yield()", "import java.util.concurrent.atomic", "import java.util.concurrent.locks", "InterruptedException"
Listing 1: Text searches that help uncover concurrent code.
Doing a simple text search for these strings in your IDE will reveal much of the code that you are looking for[1]. If you prefer the command line, you can use the Unix find command as in Listing 2 to do a similar search.
find . -name "*.java" -exec grep -n -H --color "implements Runnable" {} \;
Listing 2: Using the Unix find command to search for concurrent code.
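If you would rather run the whole list of search strings in one pass, a small Java program can do the same job. The following is only a minimal sketch: the class name ConcurrencyMarkerScan, its method names and the shortened marker list are hypothetical choices made for illustration, and it assumes your source files are UTF-8 encoded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any tool mentioned in this paper.
class ConcurrencyMarkerScan {
    // A shortened version of the search strings; extend it as needed.
    static final List<String> MARKERS = Arrays.asList(
            "implements Runnable", "extends Thread", "synchronized",
            ".notify()", ".notifyAll()", ".wait(", ".join(", ".sleep(",
            "import java.util.concurrent", "InterruptedException");

    /** Returns the markers that occur in a single line of source text. */
    static List<String> markersIn(String line) {
        return MARKERS.stream().filter(line::contains).collect(Collectors.toList());
    }

    /** Prints file:line plus the matched markers for every hit under the root. */
    static void scan(Path root) {
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(p -> p.toString().endsWith(".java")).forEach(file -> {
                try {
                    List<String> lines = Files.readAllLines(file); // assumes UTF-8
                    for (int i = 0; i < lines.size(); i++) {
                        List<String> hits = markersIn(lines.get(i));
                        if (!hits.isEmpty()) {
                            System.out.println(file + ":" + (i + 1) + " " + hits);
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Scans the directory given as the first argument, or the current one.
        scan(Paths.get(args.length > 0 ? args[0] : "."));
    }
}
```

Like the text searches themselves, this is plain substring matching, so it will also report hits inside comments and string literals. Treat the output as a list of places to look, not as a list of bugs.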
These searches can only yield so much, however. Any Java class is subject to being instantiated and used by a thread, which is not always as easily identifiable. For example, Listing 3 demonstrates how any Java class can be used in a thread. Once you locate the class SimpleThread from the above search, you have to look into that class to identify what classes it uses since those referenced classes are now subject to multithreading concerns.
public class SimpleThread implements Runnable {
    private JavaClass jc;
    public void run() {
        jc = new JavaClass();
    }
}
Listing 3: A typical Java class referenced from within a thread.
Simple text-based searches are an easy way to start to identify what code may have to be tested, but they do not provide any information on the nature and intent of the code. For instance, the programmers may know that some of the code does not need thread safety. For other parts of the code, it may be a total surprise that multiple threads could run it. Given that, it is a good idea to implement a peer review process to assess how urgently the found code needs to be tested for multithreaded safety (Goetz). When experienced members of the team review the code, many important contextual details come out that are not evident to a single programmer. The findings from the peer review can be used to guide the testing strategy.
Using Static Analysis to Pinpoint Concurrency Bugs
Understanding where concurrency exists in your code is a necessary precondition to testing that code. However, just locating the code does not identify bugs in it. So you need to progress from finding concurrent code to finding bugs in it and then testing to verify that it is bug free—all in a fashion that fits into your modern development environment. Static analysis tools offer a way to achieve the middle step by identifying some bugs in your concurrent code. This section will discuss an Open Source static analysis tool called FindBugs (Hovemeyer & Pugh).
FindBugs is a reliable tool for picking out all sorts of “bug patterns” in Java code. FindBugs uses static analysis and heuristics to seek out code patterns and API usage that are indicative of or known to cause bugs (Hovemeyer & Pugh). The tool is effective enough, and sufficiently free of false positives, that it has been employed to look for bugs in production code at Google (Ayewah, et al.). More pertinent to this discussion, however, is the fact that FindBugs can find bugs in concurrent code. In fact, FindBugs is the sole static analysis tool suggested by Brian Goetz for use with multithreaded code in the chapter “Testing Concurrent Programs” in (Goetz). That same chapter has a summary of the concurrent bug patterns that FindBugs can identify, including inconsistent synchronization, unreleased locks, notification errors and spin loops.
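To make one of those patterns concrete, here is a minimal sketch of what inconsistent synchronization tends to look like. All class and method names are hypothetical, and whether FindBugs actually reports a class this small depends on its heuristics, so read it as an illustration of the pattern and not as a guaranteed finding.

```java
// Hypothetical example classes; not taken from any code base discussed here.
class InconsistentSyncSketch {
    /** Most accesses to count hold the lock, but get() does not. */
    static class SuspectCounter {
        private int count;
        synchronized void increment() { count++; }
        synchronized void reset() { count = 0; }
        int get() { return count; } // unsynchronized read of a guarded field
    }

    /** Every access to count holds the same lock. */
    static class ConsistentCounter {
        private int count;
        synchronized void increment() { count++; }
        synchronized void reset() { count = 0; }
        synchronized int get() { return count; }
    }

    /** Increments a ConsistentCounter from several threads and returns the total. */
    static int hammer(int threads, int incrementsPerThread) {
        ConsistentCounter counter = new ConsistentCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }
}
```

The unsynchronized get() in SuspectCounter may return a stale value to a thread that never acquires the lock. The fix is usually as small as the difference between the two classes.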
Goetz’s chapter on concurrent testing is out-of-date on one point. He praises tools like FindBugs as being “effective enough to be a valuable addition to the testing process” but cautions that they “are still somewhat primitive (especially in their integration with development tools and lifecycle)”. Goetz is referring to the fact that, in its infancy, FindBugs was only available as a clunky Java Swing application. This is no longer the case, as FindBugs is now commonly integrated into the modern Java developer’s tool set. To begin with, there is an Ant task for FindBugs that can be configured to produce XML output. Further, several CI tools, such as Hudson, Jenkins, Bamboo and Sonar, have plugins that can produce sophisticated dashboards from the FindBugs XML output. The Ant task allows FindBugs analysis to be run automatically as part of a build process, where the XML output can then be picked up by the CI plugin to create the dashboard. The dashboards include all of FindBugs’ findings, but make it simple to drill down into just the concurrency bugs.
Figure 1: The FindBugs summary page in Hudson.
FindBugs is the second step in the chain of concurrency testing tools mainly because it is so simple to use. To use the Ant task you just tell FindBugs where to find the compiled .class files, where to locate the source code and where to write the output file. Once the XML output has been produced, the CI plugins for FindBugs generally only require as input a path to that file to work. The payoff for following such simple steps is huge. Figure 1 shows a view of the FindBugs dashboard plugin for the Hudson CI server that lists the number of bugs found in the “multithreaded correctness” category[2]. Figure 2 shows the dashboard’s ability to pinpoint the exact lines of code that contain the bugs. In this case, the class SubmitBrochureOrder contains three concurrency bugs (mutable Servlet fields).
Figure 2: The FindBugs dashboard showing bugs in the SubmitBrochureOrder class.
Measuring The Amount Of Concurrent Code Tested
So far, this paper has covered two processes: locating what concurrent code exists in your code base and determining if your code contains any common concurrency bug patterns. Another precursor to actually testing the code is to determine how much of the concurrent code is being tested with respect to thread safety. Modern Java developers are used to the measurement of “code coverage” (Miller & Maloney) and almost ubiquitously run both a battery of unit tests and code coverage analysis as a fundamental part of their build process. It is even common to judge the health of a code base in part by the amount of code coverage. There are well-known caveats with this practice (Glover), yet it is still an effective way to get a sense for whether attention is paid to unit testing the code or not.
There is hardly an argument against the usefulness of code coverage analysis, but only for serial code. The typical code coverage analysis tools used by modern Java developers, such as Cobertura and Emma, are not built to measure how well the concurrent aspects of the code are covered by unit tests. In fact, there is very little use currently of unit testing for concurrency, though Goetz does briefly explain how the concurrent code in the Java APIs is unit tested (Goetz).
To reconcile this problem, a team of researchers at IBM has developed the theory and tools to measure the extent to which concurrent code is tested for thread safety and has dubbed it “synchronization coverage”. This concept is not analogous to the code coverage metric for serial code. Whereas serial code coverage measures the percentage of source code lines, branches, methods and classes that are executed during testing, synchronization coverage measures the percentage of critical sections of multithreaded code that are exercised by more than one thread concurrently during test runs. Synchronization coverage is an umbrella term for multiple “coverage tasks”, each of which measures a different aspect of the thoroughness of the concurrent testing. As an example of a coverage task, if, while running tests, a synchronized block of code is not accessed by multiple threads concurrently, that block of code is not considered to have coverage. Conversely, if one or more threads have to contend for the lock on that critical section of code, the code is considered to be covered (Bron, et al.).
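The coverage task described above can be approximated by hand to show the idea. The sketch below is not how IBM ConTest measures synchronization coverage. It swaps the synchronized block for a java.util.concurrent.locks.ReentrantLock so that a failed tryLock() can be recorded as contention, and the class name ContentionProbe and its methods are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical probe that records whether a critical section was ever contended.
class ContentionProbe {
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicBoolean contended = new AtomicBoolean(false);
    private final CountDownLatch firstContention = new CountDownLatch(1);

    /** Runs the critical section and records whether the caller had to wait. */
    void runGuarded(Runnable criticalSection) {
        if (!lock.tryLock()) {            // another thread holds the lock right now
            contended.set(true);
            firstContention.countDown();
            lock.lock();                  // now wait as a synchronized block would
        }
        try {
            criticalSection.run();
        } finally {
            lock.unlock();
        }
    }

    /** True once at least one thread had to contend for the lock. */
    boolean wasContended() {
        return contended.get();
    }

    /** Forces one contended entry so that the probe reports coverage. */
    static boolean demo() {
        ContentionProbe probe = new ContentionProbe();
        Thread second = new Thread(() -> probe.runGuarded(() -> { }));
        probe.runGuarded(() -> {
            second.start();
            try {
                // Keep holding the lock until the second thread has contended.
                probe.firstContention.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return probe.wasContended();
    }
}
```

A test suite that only ever enters runGuarded from one thread leaves wasContended() false, which is exactly the "no coverage" case. A real tool gathers this kind of data by instrumenting the code, so you would not have to rewrite your locking to get it.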
This metric can be quite revealing, especially if you are just starting to add tests to an existing code base. Obtaining the synchronization coverage of your code base will immediately tell you if the thread safety of your code is being tested at all by your existing test suite. Since you now know where your concurrent code is located and have already weeded out easy-to-find concurrency bugs in it, you can use synchronization coverage to plan what to actually start testing. The next section covers testing concurrent code, which includes using IBM ConTest, a tool that implements the synchronization coverage metric.
Testing Concurrent Code
As with testing in general, there are many methods for testing concurrent code (Watts). This section will focus on two specific modes of testing that fit particularly well in the modern Java developer’s portfolio of tools. The two forms of testing are unit testing and automated exploratory testing. Both types can be used to test threaded code to ensure that it works properly when run on multiple threads concurrently.
There are few options for unit testing concurrent code. However, one promising option is the MultithreadedTC library (Pugh & Ayewah). This library uses the novel abstraction of a “metronome tick” to provide a mechanism for sequencing the interleaving of multiple threads. The library automatically moves to the next “tick” every time all the threads in a running MultithreadedTC unit test are blocked. The tester can then assert that various conditions hold for a specific tick. The metronome tick allows MultithreadedTC unit tests to verify the correctness of code’s multithreaded behavior in a way that does not interfere with the natural scheduling of threads by the JVM.
MultithreadedTC is promising for two reasons. First, it is highly automatable, which fits perfectly into the modern Java developer’s environment. MultithreadedTC is built on JUnit, which makes it easy to run in an Ant script as part of the CI build. This also means that it produces JUnit reports that can be viewed on the JUnit dashboard of a CI server. Second, MultithreadedTC is a rare tool in that it gives the developer the control to test very specific threading scenarios. Any number of threads can be used in a test and the metronome tick allows the finest granularity of testing possible. This is a powerful and unique technique that does not exist in many other multithreaded testing tools.
In contrast, IBM ConTest is a no less useful and important tool that takes control away from the tester (Edelstein, et al.). Rather than asking the tester to tediously code fine-grained threading scenarios, ConTest randomly explores as many of the thread interleavings as possible. The tester gets to write tests that more closely resemble serial unit tests, which ConTest then runs across a varying number of multiple threads simultaneously over a period of time. The idea behind ConTest is to help reduce the complexity of concurrency testing by covering as many threading scenarios as possible. In doing so, the chance of finding a concurrency bug increases.
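To give a feel for the kind of scenario such a test pins down, the sketch below checks that a second put() on a full queue blocks until another thread calls take(). It deliberately does not use the MultithreadedTC API. As a rough stand-in for a tick, it polls Thread.getState() until the producer is waiting, which is an assumption of this sketch and far cruder than what the library does. The class and method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical plain-Java stand-in for a sequenced two-thread test.
class BlockingPutSketch {
    /** Returns true if the second put() blocked until a take() made room. */
    static boolean secondPutBlocksUntilTake() {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);
        AtomicBoolean secondPutDone = new AtomicBoolean(false);

        Thread producer = new Thread(() -> {
            try {
                queue.put(1);             // fills the queue (capacity 1)
                queue.put(2);             // expected to block until take()
                secondPutDone.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        try {
            // Crude "tick": wait until the producer is parked inside put(2).
            while (producer.getState() != Thread.State.WAITING) {
                Thread.sleep(1);
            }
            boolean blockedBeforeTake = !secondPutDone.get();
            int first = queue.take();     // makes room and releases the producer
            producer.join();
            int second = queue.take();
            return blockedBeforeTake && first == 1 && second == 2 && secondPutDone.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Hand-rolled sequencing like this gets unwieldy as soon as a third thread or a second ordering constraint is involved, which is the gap a purpose-built library is meant to fill.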
ConTest is also highly automatable. It produces reports on the number of test passes and failures and even includes metrics on synchronization coverage. Since it simply instruments your code, the developer can use any tests written for that code when running ConTest. In most modern Java development CI systems, this amounts to adding two extra steps to the build process.
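The general shape of this style of testing, a serial-looking check run repeatedly from many threads, can be sketched in plain Java. This is not ConTest and does no instrumentation. The Thread.yield() call is only a weak stand-in for the scheduling variation a real tool introduces, and the class name StressHarnessSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical hand-rolled harness; not part of ConTest.
class StressHarnessSketch {
    /**
     * Runs the check `runs` times on each of `threads` threads and
     * returns how many runs reported failure (returned false).
     */
    static int failures(int threads, int runs, Callable<Boolean> check) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> {
                    int failed = 0;
                    for (int r = 0; r < runs; r++) {
                        if (!check.call()) {
                            failed++;
                        }
                        Thread.yield(); // nudge the scheduler between runs
                    }
                    return failed;
                }));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("stress run aborted", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A harness like this can only increase the odds of hitting a bad interleaving. A passing run says nothing about the interleavings that did not happen to occur.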
Conclusion
Modern Java developers are comfortable with the cycle of developing code and tests simultaneously, building the code using a slew of tools for automation and then automatically verifying the code and producing metrics on the health of their code. Metrics and verification are used as a means to constantly improve the quality of the code. Even now, however, that entire process is focused on serial code only. It has been shown in this paper that developing tests for and verifying the quality of concurrent code can be integrated into this complex development environment as smoothly as the tools for serial code.
Future Work
It has been shown that there are more effective algorithms for finding bugs than what FindBugs and IBM ConTest can provide (Eytani, et al.). However, these algorithms are not yet embedded in tools that are useful for developers outside of academia. Some advanced industry tools (Dern & Tan) do make use of some of these algorithms, but there is little movement in this area in the Java ecosystem. Further, synchronization coverage is a straightforward and useful metric that could be integrated with serial code coverage seamlessly. There is, however, no tool known to the author other than IBM ConTest that implements synchronization coverage metrics for Java code. It should be feasible, and is certainly desirable, for synchronization coverage to become an additional part of existing code coverage analysis tools like Cobertura and Emma.
Bibliography
Ayewah, N., et al. (2007). “Using FindBugs on Production Software”. OOPSLA’07, October 21–25. Montréal, Québec, Canada.
Beck, K. (2003). Test-Driven Development by Example. Upper Saddle River, NJ: Addison-Wesley.
Bron, A., et al. (2005). “Applications of Synchronization Coverage”. PPoPP’05, June 15–17. Chicago, Illinois, USA.
Dern, C., Tan, R. (2009). “Code Coverage for Concurrency”. MSDN Magazine from http://msdn.microsoft.com/en-us/magazine/ee412257.aspx.
Duvall, P., Matyas, S., Glover, A. (2007). Continuous Integration: Improving Software Quality and Reducing Risk. Upper Saddle River, NJ: Addison-Wesley.
Edelstein, O., et al. (2008). Automating the Testing of Multi-threaded Java Programs. IBM Research Laboratory in Haifa.
Eytani, Y., et al. (2006). “Towards a framework and a benchmark for testing tools for multi-threaded programs”. Concurrency Computat.: Pract. Exper. 2007; 19:267–279.
Fowler, M. (1999). Refactoring: Improving the Design of Existing Code. Upper Saddle River, NJ: Addison-Wesley Professional.
Glover, A. (January, 2006). “In pursuit of code quality: Don’t be fooled by the coverage report”. IBM DeveloperWorks from http://www.ibm.com/developerworks/java/library/j-cq01316/.
Goetz, B., et al. (2006). Java Concurrency in Practice. Upper Saddle River, NJ: Addison-Wesley.
Hovemeyer, D., Pugh, W. (2004). “Finding Bugs is Easy”. OOPSLA’04, Oct. 24–28. Vancouver, British Columbia, Canada.
Miller, J., Maloney, C. (February 1963). “Systematic mistake analysis of digital computer programs”. Communications of the ACM. New York, NY, USA.
Pugh, W., Ayewah, N. (2007). Unit testing concurrent software. IEEE/ACM International Conference on Automated Software Engineering, Atlanta, GA, USA.
Watts, N. (March, 2011). “A Survey of Methods and Tools for Testing Parallel and Concurrent Programs”. Written for Comp 674 at Franklin University.
[1] These strings will only reveal the most obvious multi-threaded code. There are other much trickier situations which have been left out of the scope of this paper (Goetz).
[2] The dashboard has been created from the FindBugs output for a real production code base developed at Ohio Mutual Insurance Group.
July Issue of NFJS, the Magazine published.
Here’s what is in this month’s NFJS, the Magazine
Raju Gandhi – On Eloquent Conversations Part 1
It goes without saying that an enterprise consists of many moving parts, with multiple applications that serve to support different business processes. These applications rarely live in a silo, and consequently need to be integrated to allow for reliable and in some cases, secure data transfer. In this two-part article series, we will discuss some of the hurdles to integration and some possible approaches. We will then turn our attention mainly to messaging. We will also take a look at Spring Integration, a library from Spring Source that lets us integrate our applications in an unobtrusive and declarative manner.
Peter Bell – MongoDB – Why and How?
Many NoSQL data stores are designed primarily to solve problems of scale. Unusually, Mongo can be a great fit for building web applications whether you need to scale or not. In this article we’ll look at why you might consider Mongo, and how to get started with it.
Daniel Hinojosa – Simple and Easy Guide to Types in Scala
There is much to love about Scala. One of the harder things to digest is types. Scala is a statically typed language. Thankfully, much of the generics needed as a user are hidden. Those who have dealt with variance in Java will encounter some slight twists when working in Scala, especially when type inference and implicit function parameters are included. Many Scala books on the market do an excellent job covering variance, but the interplay between type inference and variance often lacks appropriate coverage. This article aims to apply some mental spackle to solidify the understanding of types in Scala.
Nathaniel Schutta – Ajax Library Smackdown: Dojo vs. YUI
Ajax is everywhere, from the local newspaper to sites that the CEOs surf. Contrary to popular belief, it isn’t rocket science, especially with the right library. Explore the popular YUI and Dojo libraries, and learn how they can simplify typical Ajax techniques and make JavaScript easier to work with. Discover why you should use a library in the first place and how to choose among libraries, and get some specific examples from YUI and Dojo.
Here’s what is in next month’s NFJS, the Magazine
Venkat Subramaniam – Scala Traits Part 1
In this article, we’ll learn about traits, how they are woven in at compile time, and how to easily implement the decorator pattern with them.
Craig Walls – NoXML : Spring for XML Haters
When many think of Spring, they think of dependency injection, aspect-oriented programming, declarative transactions…and lots of XML. XML has fallen out of favor with much of the development community and with its heavy use of XML, Spring doesn’t seem as fresh as it once was.
In spite of its angle bracket-laden history, recent versions of Spring offer many non-XML configuration options. In this article we’ll explore some new features in Spring 3.0 and Spring 3.1 that can dramatically reduce or even eliminate XML from your Spring applications.
Raju Gandhi – On Eloquent Conversations Part 2
In the first installment of this series, we discussed the need for integration, and some of the potential pitfalls, especially when attempting to roll your own integration system. We then proceeded to discuss some of the patterns in Gregor Hohpe’s and Bobby Woolf’s aptly named Enterprise Integration Patterns and their corresponding implementations in Spring Integration. We discussed the core patterns that make up the founding blocks of Spring Integration – “Message Channel”, “Message” and “Message Endpoint”. In this article we will explore a few more patterns that will allow you to route, filter and manipulate messages as well as talk to external systems. We will learn how to do this while leveraging Spring’s declarative model that lets you focus on your domain, and lets Spring Integration handle the specifics of messaging.
Scott Leberknight – HBase
This article will examine HBase, a non-relational database designed to scale horizontally while still providing real-time, random read/write access to your data.
I’m very proud of the work we do on this new magazine. The staff and I have worked hard to produce a top-notch magazine that is unique in the realm of software development magazines. The magazine costs $50 per year, which includes 10 issues. Each issue has at least four articles. You can download in a print-quality PDF and two mobile formats: EPUB (for the Nook and iPad) and MOBI (for the Kindle). The articles are professionally edited and are written by top experts in their field, so the content is worth well more than the $50 you pay.
The June issue just published this morning and you can subscribe here: http://bit.ly/fETp6d. As always, if you have questions just comment on this post and I’ll respond quickly.
Session and Clustered Java Web Apps
Session can be a headache to work with in Java web applications. For that reason, most developers now use MVC frameworks, such as Java Server Faces, that hide the use of session and allow you to work with simple Java beans and configuration instead. But, that’s not always the case, especially where you have plain servlets that you have to maintain. With servlets, you have the power to put objects into and take objects out of session yourself. Alone, this presents thread-safety issues, but manually managing session in an application that is clustered presents more problems.
There is a jump that has to be made when moving from a single server environment to a clustered one. In a clustered environment, session management becomes complex because it has to be shared by all the servers in a cluster. This sharing of session becomes a distribution of objects strategy, which is an advanced topic you should familiarize yourself with before clustering your application. A high-end application server (e.g., IBM WebSphere) will handle most of this complexity for you, but you have to educate yourself on the myriad configuration options and their effect on your application.
In my recent excursion into running an existing Java web app in a clustered environment, I found that there was one major stumbling block from my programmer point-of-view—understanding the mechanism for distributing session objects between servers in a cluster. If you’re about to do the same for your application, the first thing to understand is that distribution of session objects is done through Java object serialization. That method is prescribed in the Servlet specification in section “SRV.7.7.2 Distributed Environments”, in fact. How application servers actually manage that so session is consistent is left up to the vendors and is one way in which they can compete.
It is important to understand that object serialization is the mechanism for sharing session. To start, it means that every single object that gets put into session has to do two things: 1.) it has to implement the interface java.io.Serializable and 2.) it has to actually be serializable. An object is not really serializable unless its entire object graph is also serializable. So think about this: if you didn’t design your application with this requirement in mind, you could be in serious trouble when moving to a cluster. At the very least, you will have to identify all objects you put into session in your servlets and modify them to implement java.io.Serializable (which doesn’t require you to override any methods). At the worst, you will have to write custom serialization code for objects that don’t serialize naturally. The transient keyword can help with this by marking fields of objects that should be skipped during serialization.
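A quick way to see the difference between implementing Serializable and actually being serializable is to try writing the object out. The sketch below is only an illustration: the class names (Helper, BrokenCart, FixedCart) are made up, and a real check would run against the objects your servlets actually put into session.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes used only to illustrate the object graph requirement.
class SessionSerializationCheck {
    /** Not serializable: stands in for a connection, stream or similar resource. */
    static class Helper { }

    /** Implements Serializable, but its object graph contains a Helper. */
    static class BrokenCart implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Helper helper = new Helper();
    }

    /** Same shape, but the non-serializable field is skipped via transient. */
    static class FixedCart implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String customerId = "c-42";
        private transient Helper helper = new Helper();
    }

    /** Returns true if the whole object graph can be written out. */
    static boolean serializes(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (IOException e) {
            // java.io.NotSerializableException is a subclass of IOException.
            return false;
        }
    }
}
```

Keep in mind that a transient field comes back as null after deserialization, so code that reads it on another server in the cluster has to be prepared to rebuild it.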
The bright side to this discussion is that there is ample documentation out there on the topic, although it is spread out. Below I have annotated a few sources I found very helpful when I started researching this topic. I am working with an IBM WebSphere cluster, so most of the documentation is from IBM, however, all of the advice applies to any application server. Moving to a clustered environment is probably harder work than you realized, so you’ll be well served (and appreciated by your boss) if you check out these resources before making the jump to a clustered environment.
I’m still actively researching this topic myself. As such, I’ll keep updating this post as I learn more (read: I might be wrong on a detail or two).
Best practices for using HTTP sessions
This is IBM’s advice on the best way to handle various aspects of session. It’s broadly applicable advice that, honestly, I wish I had seen about 5 years ago when I first started working with web apps.
Java Theory and Practice: State Replication in the Web Tier
http://www.ibm.com/developerworks/java/library/j-jtp07294/index.html
This is an older article by Brian Goetz that contains similar knowledge to the previous link.
WebSphere Application Server V7 Administration and Configuration Guide
http://www.redbooks.ibm.com/redbooks/pdfs/sg247615.pdf
Overall, this book is very WebSphere specific. However, Chapter 12 does describe how WebSphere handles distributed session and it’s very enlightening to read. On a guess, I’d say that other app servers do things similarly. There is actually a good bit of general discussion in Chapter 12, much more than there is explanation of what buttons to click to set it up so it’s worth the time to skim.
Designing and Coding Applications for Performance and Scalability in WebSphere Application Server
http://www.redbooks.ibm.com/redbooks/pdfs/sg247497.pdf
This book is far less specific to WebSphere than it sounds. Chapter 3 “General coding considerations” is a fantastic overview of how to deal with topics such as garbage collection, synchronization and database access. Most of the other chapters deal with issues that are similarly applicable to any application server, not just WebSphere.
The big bonus for downloading this RedBook is that it has a section called “Cluster considerations”. This section walks you through all of the things you should consider when you want to run your application on a cluster of servers.
Serializing Access to Session
This link explains how to serialize access to objects in session for IBM WebSphere application server 7.0. I threw this link in to point out that session is basically a shared memory that multiple threads have access to concurrently. This is true whether you are running on a single server or a cluster of servers. This document is specific to WebSphere, but I’m guessing the other commercial app servers have similar options.





