ApacheCon EU 2012
Apache 2.4
- mpm_event is the new default MPM, at least in the vanilla distribution, but not in many Linux distributions
- correlation id. log format %L. lets you correlate access and error log entries
- much more powerful expression parser in config.
- mod_define in core
Policing the RFC: How Not To Kill Your Website at Scale
- shows some bad practices first. not setting content-type, no-cache, Vary: User-Agent..
- ! read up on Vary header
- mod_policy validates responses.
- caches might care about content-length. might want to decide if the entity fits the available space
- etags in quotes
- mod_policy can either log violations or kill the request (500)
- enforce http 1.1 on web service apis because in 1.0 keep-alive defaults to off and many client libraries default to 1.0.
- mod_cache diagnostics: X-Cache: HIT from …, X-Cache-Detail: “long description” from ….
- Cache Detail Header
- tl;dr: just be rfc compliant (mod_policy) and use mod_cache
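The checks mentioned above (Content-Type set, ETags quoted, no Vary: User-Agent, log vs. kill) can be sketched in a few lines of plain Java. This is only an illustration of the idea, not how mod_policy is implemented: the class `ResponsePolicyCheck`, its method names and the choice of rules are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical response checker; names and rules are illustrative, not mod_policy's API. */
final class ResponsePolicyCheck {

    /** Returns one human-readable violation for every rule the response headers break. */
    static List<String> violations(Map<String, String> headers) {
        // HTTP header names are case-insensitive, so normalise the lookup.
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        List<String> found = new ArrayList<>();
        if (!h.containsKey("Content-Type")) {
            found.add("missing Content-Type");
        }
        String etag = h.get("ETag");
        if (etag != null && !isQuotedEntityTag(etag)) {
            found.add("ETag is not a quoted string: " + etag);
        }
        String vary = h.get("Vary");
        if (vary != null && vary.toLowerCase(Locale.ROOT).contains("user-agent")) {
            found.add("Vary: User-Agent fragments the cache");
        }
        return found;
    }

    /** Accepts the strong form "abc" and the weak form W/"abc". */
    static boolean isQuotedEntityTag(String value) {
        String v = value.startsWith("W/") ? value.substring(2) : value;
        return v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"");
    }

    /** Enforce mode turns any violation into a 500; log mode keeps the original status. */
    static int finalStatus(int status, List<String> violations, boolean enforce) {
        return (enforce && !violations.isEmpty()) ? 500 : status;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new TreeMap<>();
        headers.put("ETag", "abc123"); // unquoted on purpose
        headers.put("Vary", "User-Agent");
        List<String> found = violations(headers);
        found.forEach(System.out::println);
        System.out.println("log mode status: " + finalStatus(200, found, false));
        System.out.println("enforce mode status: " + finalStatus(200, found, true));
    }
}
```

Running the checks in log mode first and switching to enforce mode later mirrors the "log violations or kill the request" choice from the talk.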
Apache Traffic Server
- proxy
- event based, multi threaded
- Number of threads automatically scales with system capacity (# cpus, # disks)
- Cache circumvents the file system. argues: cannot run out of inodes. (? when do you run out of inodes??? e.g. ext3 default # inodes in a 1 TB volume: 134 million, i.e. one per 8 KB, so only if the avg file size is below 8 KB)
- use case: CDN.. blah
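The inode question can be checked with a quick calculation. The sketch below assumes one inode per 8 KiB of volume, which is the ratio implied by the numbers in the note, and it ignores block rounding and file system metadata. `InodeBudget` and its methods are made up for this example.

```java
/** Back-of-the-envelope inode budget; the 8 KiB-per-inode ratio is the figure from the notes. */
final class InodeBudget {

    /** Number of inodes when one inode is reserved per bytesPerInode bytes of volume. */
    static long inodeCount(long volumeBytes, long bytesPerInode) {
        return volumeBytes / bytesPerInode;
    }

    /** Rough rule: inodes are exhausted before space only if files are smaller than the ratio. */
    static boolean inodesRunOutFirst(long avgFileBytes, long bytesPerInode) {
        return avgFileBytes < bytesPerInode;
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40;
        System.out.println(inodeCount(oneTiB, 8 * 1024));           // 134217728
        System.out.println(inodesRunOutFirst(2 * 1024, 8 * 1024));  // true: many tiny cache objects
        System.out.println(inodesRunOutFirst(64 * 1024, 8 * 1024)); // false: space runs out first
    }
}
```

So the argument only bites for caches full of very small objects; with larger average object sizes the disk fills up long before the inode table does.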
ElasticSearch in production – lessons learned
- elasticsearch consulting company
- automatic field type recognition (also e.g. dates in strings)
- versioning, keeps exact copy of source document
- faceting, statistical aggregates
- geo queries
- schemaless? wrong type inference can be a hindrance because after detection it is strict about the type
- manages memory outside of the heap. so don’t just increase -Xmx. could unexpectedly lead to OutOfMemory
- use ES as sole database? the speaker believes it’s feasible, but keep in mind it’s optimized for querying, not for fast data input. Twitter comment
- no multi table transactions in ES but that problem tends to go away when you think document oriented.
- solr or ES? ES: easier configuration, automatically maps documents to lucene. automatic horizontal scaling. solr: well known…
- need to decide in advance how many shards you need. increasing # of shards requires manual work
- search evolution: lucene + analyzers + geospatial indexes, faceting => solr + auto document mapping + sub document queries + replication + => ES
- drawbacks: changing the sharding of an already populated index is non-trivial
- when they re-index stuff, they create a secondary cluster, move data there, when finished, switch at load balancer
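Why the shard count has to be fixed up front becomes clearer with a small routing sketch. ElasticSearch picks a shard by hashing a routing value (the document id by default) modulo the number of primary shards. The code below uses `String.hashCode()` as a stand-in for the real hash function, and `ShardRoutingSketch` is a hypothetical name, so take it as an illustration of the principle only.

```java
/** Illustrates hash-modulo routing; String.hashCode() is a stand-in, not ES's hash function. */
final class ShardRoutingSketch {

    /** Shard index for a document id, always in the range [0, shardCount). */
    static int shardFor(String docId, int shardCount) {
        return Math.floorMod(docId.hashCode(), shardCount);
    }

    /** Counts how many of the ids doc-0 .. doc-(docs-1) land on a different shard after a resize. */
    static int movedDocuments(int docs, int oldShards, int newShards) {
        int moved = 0;
        for (int i = 0; i < docs; i++) {
            String id = "doc-" + i;
            if (shardFor(id, oldShards) != shardFor(id, newShards)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        int docs = 10_000;
        int moved = movedDocuments(docs, 5, 6);
        System.out.println(moved + " of " + docs + " documents would need to move from 5 to 6 shards");
    }
}
```

Because most documents end up on a different shard once the modulus changes, adding a shard amounts to re-indexing, which fits the secondary-cluster approach described above.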
Cloudstack
- hypervisor-agnostic IaaS cloud management platform, donated by Citrix
Apache Tika
Tomcat 8 Preview
### Servlet 3.1
* non blocking io
* http upgrade (websockets)
* change session id on authentication: protection against session fixation. already in tc but will be standard feature as of servlet api 3.1
* misc clarifications
* security constraints are method specific: jsps respond to anything by default. what to do about this is under discussion
* overlays dropped. !what are maven overlays?
* Expression Language 3 contains lambdas
* tc7 supports websockets via proprietary api. standard api might be backported to it.
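The session fixation item is easiest to see in code. Servlet 3.1 standardises this as `HttpServletRequest.changeSessionId()`; the sketch below does not use the servlet API but a toy in-memory store, so `SessionStoreSketch` and all of its methods are hypothetical stand-ins for what the container does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Toy in-memory session store; a hypothetical stand-in for the container's session manager. */
final class SessionStoreSketch {
    private final Map<String, Map<String, Object>> sessions = new HashMap<>();

    /** Creates an empty session and returns its id. */
    String create() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new HashMap<>());
        return id;
    }

    /** Returns the session attributes, or null if the id is unknown. */
    Map<String, Object> get(String id) {
        return sessions.get(id);
    }

    /** Moves the session data to a fresh id and invalidates the old one. */
    String changeSessionId(String oldId) {
        Map<String, Object> data = sessions.remove(oldId);
        if (data == null) {
            throw new IllegalStateException("no such session");
        }
        String newId = UUID.randomUUID().toString();
        sessions.put(newId, data);
        return newId;
    }

    /** Rotates the id first, then marks the session as authenticated. */
    String login(String sessionId, String user) {
        String fresh = changeSessionId(sessionId);
        sessions.get(fresh).put("user", user);
        return fresh;
    }

    public static void main(String[] args) {
        SessionStoreSketch store = new SessionStoreSketch();
        String before = store.create(); // an attacker may have planted this id
        String after = store.login(before, "alice");
        System.out.println("old id still valid: " + (store.get(before) != null)); // false
        System.out.println("user on new id: " + store.get(after).get("user"));   // alice
    }
}
```

An id that an attacker planted before login is worthless afterwards, while attributes already stored in the session survive the rotation.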
Tomcat 8
- nio connector is default
- release at some point after java ee 7 which is scheduled for spring 2013
- “most people don’t do hot deployments but shut down, deploy, restart”
- question: multi version deployment in standard? no but in tc7
OSGi for mere mortals
CouchDB growing up
- couch db is still alive, just the project lead left.
Solr vs. ElasticSearch
- Currently Solr is based on a more recent version of Lucene than ElasticSearch (4.0 vs 3.6.1)
Managing Installations and Provisioning of OSGi Applications
- what’s an osgi http bridge? osgi web apps run as standalone with http service or with http bridge
Unit- and Integration Testing with Maven
- common test code in multi module projects: build a test-jar (see jar plugin). on the consuming side declare the dependency with
<dependency><type>test-jar</type></dependency>
- or just create a module with common test code.. how insightful..
- if you want to separate integration tests from unit tests, you could write lots of configuration to add source and target folders
- recommendation: separate module for integration tests
World of Logging
- says logback may lose messages, log4j leaks memory (sometimes)
- log4j, logback, juli, tinylog, avsl
- log4j 2 is in the works
- juli: not fun
- tinylog, avsl: the new shit
- slf4j: modern api. commons logging: old, smelly
- want apache licensed logging abstraction layer (i.e. new commons logging)
future of log4j 2
- now supports format strings
- can monitor configuration file for changes
- performance: log4j1: 2315ns, log4j2: 2386ns, logback: 2116ns
- when disabled, they’re all pretty quick: ~5ns
- check out chainsaw log viewer (maybe dev version)
- Logger logger = Logger.getLogger("foo") is tedious; wouldn’t @Inject Logger logger be nicer? => apache mayhem
- check out Apache Mayhem, logstash
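Why a disabled log statement costs almost nothing: the framework checks the level before it builds the message. The sketch below shows this with java.util.logging from the JDK, chosen only because it needs no extra dependency; the figures quoted above are for log4j and logback, not for this code. `DisabledLoggingSketch` and its counter are made up for the demonstration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Shows that a message supplier is never invoked when the level is disabled. */
final class DisabledLoggingSketch {
    static final AtomicInteger messagesBuilt = new AtomicInteger();

    /** Stands in for a costly message; counts how often it is actually built. */
    static String expensiveMessage() {
        messagesBuilt.incrementAndGet();
        return "state dump at " + System.nanoTime();
    }

    /** Logs one message and returns how many times the message had to be built. */
    static int buildsWhileLoggingAt(Level loggerLevel, Level messageLevel) {
        Logger logger = Logger.getLogger("sketch.disabled");
        logger.setUseParentHandlers(false); // no handlers attached, so nothing is printed
        logger.setLevel(loggerLevel);
        messagesBuilt.set(0);
        logger.log(messageLevel, DisabledLoggingSketch::expensiveMessage);
        return messagesBuilt.get();
    }

    public static void main(String[] args) {
        System.out.println(buildsWhileLoggingAt(Level.INFO, Level.FINE)); // 0: FINE is disabled
        System.out.println(buildsWhileLoggingAt(Level.FINE, Level.FINE)); // 1: FINE is enabled
    }
}
```

The cost of a disabled statement is then just the level comparison, which is why the frameworks end up so close to each other in that case.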
Faster builds with Buildr
- First ruby project @ ASF
- Since 2007
- developed for Apache ODE, an enterprisey BPEL engine
- Why? 8kloc POM
What’s to like about Maven?
- Central repository. Nothing like it in the Java world.
- standard, every project builds, broad acceptance
What isn’t?
- not scriptable. 30 lines for concatenating two text files
- heavy weight extension mechanism
Buildr goals
- scripting
- keep the best of maven, i.e. maven central etc.
- no xml
How does Buildr work?
-- Matthias Schütz 06 Nov 2012