Ignored By Dinosaurs 🦕

This is an interweaving of Four Kitchens' Varnish 3 VCL and this generic Varnish 4 VCL.


vcl 4.0;
# Based on: https://github.com/mattiasgeniar/varnish-4.0-configuration-templates/blob/master/default.vcl

import std;
import directors;

backend server1 { # Define one backend
 .host = "127.0.0.1"; # IP or Hostname of backend
 .port = "8080"; # Port Apache or whatever is listening
 .max_connections = 300; # That's it

 .probe = {
 #.url = "/"; # short easy way (GET /)
 # We prefer to only do a HEAD /
 .request =
 "HEAD / HTTP/1.1"
 "Host: localhost"
 "Connection: close"
 "User-Agent: Varnish Health Probe";

 .interval = 5s; # check the health of each backend every 5 seconds
 .timeout = 1s; # timing out after 1 second.
 .window = 5; # If 3 out of the last 5 polls succeeded the backend is considered healthy, otherwise it will be marked as sick
 .threshold = 3;
 }

 .first_byte_timeout = 300s; # How long to wait before we receive a first byte from our backend?
 .connect_timeout = 5s; # How long to wait for a backend connection?
 .between_bytes_timeout = 2s; # How long to wait between bytes received from our backend?
}

/*acl purge {
 # ACL we'll use later to allow purges
 "localhost";
 "127.0.0.1";
 "::1";
}*/


/*acl editors {
 # ACL to honor the "Cache-Control: no-cache" header to force a refresh but only from selected IPs
 "localhost";
 "127.0.0.1";
 "::1";
}*/

sub vcl_init {
 # Called when VCL is loaded, before any requests pass through it.
 # Typically used to initialize VMODs.

 new vdir = directors.round_robin();
 vdir.add_backend(server1);
 # vdir.add_backend(server...);
 # vdir.add_backend(servern);
}

sub vcl_recv {
 # Called at the beginning of a request, after the complete request has been received and parsed.
 # Its purpose is to decide whether or not to serve the request, how to do it, and, if applicable,
 # which backend to use.
 # also used to modify the request
 # call ban_list; # no "sub ban_list" is defined in this file; define it before re-enabling this call
 #set req.url = std.tolower(req.url);

 set req.backend_hint = vdir.backend(); # send all traffic to the vdir director

 # Normalize the header, remove the port (in case you're testing this on various TCP ports)
 set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");

 # Normalize the query arguments
 set req.url = std.querysort(req.url);

 # Allow purging
 if (req.method == "PURGE") {
/* if (!std.ip(req.http.X-Forwarded-For, "0.0.0.0") ~ purge) { # purge is the ACL defined at the beginning
 # Not from an allowed IP? Then die with an error.
 return (synth(405, "This IP - " + std.ip(req.http.X-Forwarded-For, "0.0.0.0") + " is not allowed to send PURGE requests."));
 }*/
 # NOTE: while the ACL check above stays commented out, any client can send PURGE.
 # If you got to this stage (and didn't error out above), purge the cached result
 return (purge);
 }

 # Only deal with "normal" types
 if (req.method != "GET" &&
 req.method != "HEAD" &&
 req.method != "PUT" &&
 req.method != "POST" &&
 req.method != "TRACE" &&
 req.method != "OPTIONS" &&
 req.method != "PATCH" &&
 req.method != "DELETE") {

 return (pipe);
 }

 if (req.url ~ "^/status\.php$" ||
 req.url ~ "^/update\.php$" ||
 req.url ~ "^/admin$" ||
 req.url ~ "^/admin/.*$" ||
 req.url ~ "^/flag/.*$" ||
 req.url ~ "^.*/ajax/.*$" ||
 req.url ~ "^.*/ahah/.*$") {
 return (pass);
 }

 # Implementing websocket support (https://www.varnish-cache.org/docs/4.0/users-guide/vcl-example-websockets.html)
 if (req.http.Upgrade ~ "(?i)websocket") {
 return (pipe);
 }

 # Drupal's batch mode will behave in a funky manner since all cookies except
 # for the session get stripped out below. This makes batch fall into 
 # op=do_nojs mode, which isn't really needed. Just get Varnish out of the way.
 if (req.url ~ "(^/batch)") {
 return (pipe);
 }

 # Only cache GET or HEAD requests. This makes sure the POST requests are always passed.
 if (req.method != "GET" && req.method != "HEAD") {
 return (pass);
 }

 # Strip hash, server doesn't need it.
 if (req.url ~ "\#") {
 set req.url = regsub(req.url, "\#.*$", "");
 }

 # Strip a trailing ? if it exists
 if (req.url ~ "\?$") {
 set req.url = regsub(req.url, "\?$", "");
 }

 # Some generic cookie manipulation, useful for all templates that follow
 # Remove the "has_js" cookie

 if (req.http.Cookie) {
 # 1. Append a semi-colon to the front of the cookie string.
 # 2. Remove all spaces that appear after semi-colons.
 # 3. Match the cookies we want to keep, adding the space we removed
 # previously back. (\1) is first matching group in the regsuball.
 # 4. Remove all other cookies, identifying them by the fact that they have
 # no space after the preceding semi-colon.
 # 5. Remove all spaces and semi-colons from the beginning and end of the
 # cookie string.
 set req.http.Cookie = ";" + req.http.Cookie;
 set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";"); 
 set req.http.Cookie = regsuball(req.http.Cookie, ";(SESS[a-z0-9]+|SSESS[a-z0-9]+|NO_CACHE)=", "; \1=");
 set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
 set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
 
 if (req.http.Cookie == "") {
 # If there are no remaining cookies, remove the cookie header. If there
 # aren't any cookie headers, Varnish's default behavior will be to cache
 # the page.
 unset req.http.Cookie;
 }
 else {
 # If there are any cookies left (a session or NO_CACHE cookie), do not
 # cache the page. Pass it on to Apache directly.
 return (pass);
 }
 }

 if (req.http.Cache-Control ~ "(?i)no-cache") {
 #if (req.http.Cache-Control ~ "(?i)no-cache" && client.ip ~ editors) { # create the acl editors if you want to restrict the Ctrl-F5
 # http://varnish.projects.linpro.no/wiki/VCLExampleEnableForceRefresh
 # Ignore requests via proxy caches and badly behaved crawlers
 # like msnbot that send no-cache with every request.
 if (! (req.http.Via || req.http.User-Agent ~ "(?i)bot" || req.http.X-Purge)) {
 #set req.hash_always_miss = true; # Doesn't seem to refresh the object in the cache
 return(purge); # Couple this with restart in vcl_purge and X-Purge header to avoid loops
 }
 }

 # Large static files are delivered directly to the end-user without
 # waiting for Varnish to fully read the file first.
 # Varnish 4 fully supports Streaming, so set do_stream in vcl_backend_response()
 if (req.url ~ "^[^?]*\.(7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|xz|zip)(\?.*)?$") {
 unset req.http.Cookie;
 return (hash);
 }

 # Remove all cookies for static files
 # A valid discussion could be held on this line: do you really need to cache static files that don't cause load? Only if you have memory left.
 # Sure, there's disk I/O, but chances are your OS will already have these files in their buffers (thus memory).
 # Before you blindly enable this, have a read here: https://ma.ttias.be/stop-caching-static-files/
 if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
 unset req.http.Cookie;
 return (hash);
 }

 # Send Surrogate-Capability headers to announce ESI support to backend
 set req.http.Surrogate-Capability = "key=ESI/1.0";

 if (req.http.Authorization) {
 # Not cacheable by default
 return (pass);
 }

 return (hash);
}

sub vcl_pipe {
 # Called upon entering pipe mode.
 # In this mode, the request is passed on to the backend, and any further data from both the client
 # and backend is passed on unaltered until either end closes the connection. Basically, Varnish will
 # degrade into a simple TCP proxy, shuffling bytes back and forth. For a connection in pipe mode,
 # no other VCL subroutine will ever get called after vcl_pipe.

 # Note that only the first request to the backend will have
 # X-Forwarded-For set. If you use X-Forwarded-For and want to
 # have it set for all requests, make sure to have:
 # set bereq.http.connection = "close";
 # here. It is not set by default as it might break some broken web
 # applications, like IIS with NTLM authentication.

 # set bereq.http.Connection = "Close";

 # Implementing websocket support (https://www.varnish-cache.org/docs/4.0/users-guide/vcl-example-websockets.html)
 if (req.http.upgrade) {
 set bereq.http.upgrade = req.http.upgrade;
 }

 return (pipe);
}

sub vcl_pass {
 # Called upon entering pass mode. In this mode, the request is passed on to the backend, and the
 # backend's response is passed on to the client, but is not entered into the cache. Subsequent
 # requests submitted over the same client connection are handled normally.

 # return (pass);
}

# The data on which the hashing will take place
sub vcl_hash {
 # Called after vcl_recv to create a hash value for the request. This is used as a key
 # to look up the object in Varnish.

 hash_data(req.url);

 if (req.http.host) {
 hash_data(req.http.host);
 } else {
 hash_data(server.ip);
 }

 # hash cookies for requests that have them
 if (req.http.Cookie) {
 hash_data(req.http.Cookie);
 }
}

sub vcl_hit {
 # Called when a cache lookup is successful.

 if (obj.ttl >= 0s) {
 # A pure unadulterated hit, deliver it
 return (deliver);
 }

 # https://www.varnish-cache.org/docs/trunk/users-guide/vcl-grace.html
 # When several clients are requesting the same page Varnish will send one request to the backend and place the others on hold while fetching one copy from the backend. In some products this is called request coalescing and Varnish does this automatically.
 # If you are serving thousands of hits per second the queue of waiting requests can get huge. There are two potential problems - one is a thundering herd problem - suddenly releasing a thousand threads to serve content might send the load sky high. Secondly - nobody likes to wait. To deal with this we can instruct Varnish to keep the objects in cache beyond their TTL and to serve the waiting requests somewhat stale content.

# if (!std.healthy(req.backend_hint) && (obj.ttl + obj.grace > 0s)) {
# return (deliver);
# } else {
# return (miss);
# }

 # We have no fresh fish. Let's look at the stale ones.
 if (std.healthy(req.backend_hint)) {
 # Backend is healthy. Limit age to 10s.
 if (obj.ttl + 10s > 0s) {
 #set req.http.grace = "normal(limited)";
 return (deliver);
 } else {
 # No candidate for grace. Fetch a fresh object.
 return(miss);
 }
 } else {
 # backend is sick - use full grace
 if (obj.ttl + obj.grace > 0s) {
 #set req.http.grace = "full";
 return (deliver);
 } else {
 # no graced object.
 return (miss);
 }
 }

 # fetch & deliver once we get the result
 return (miss); # Dead code, keep as a safeguard
}

sub vcl_miss {
 # Called after a cache lookup if the requested document was not found in the cache. Its purpose
 # is to decide whether or not to attempt to retrieve the document from the backend, and which
 # backend to use.

 return (fetch);
}

# Handle the HTTP response coming from our backend
sub vcl_backend_response {
 # Called after the response headers have been successfully retrieved from the backend.
 # set beresp.http.X-Backend = beresp.backend.name;

 # Parse ESI request and remove Surrogate-Control header
 if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
 unset beresp.http.Surrogate-Control;
 set beresp.do_esi = true;
 }

 # Enable cache for all static files
 # The same argument as the static caches from above: monitor your cache size, if you get data nuked out of it, consider giving up the static file cache.
 # Before you blindly enable this, have a read here: https://ma.ttias.be/stop-caching-static-files/
 if (bereq.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
 unset beresp.http.set-cookie;
 }

 # Large static files are delivered directly to the end-user without
 # waiting for Varnish to fully read the file first.
 # Varnish 4 fully supports Streaming, so use streaming here to avoid locking.
 if (bereq.url ~ "^[^?]*\.(7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|xz|zip|csv)(\?.*)?$") {
 unset beresp.http.set-cookie;
 set beresp.do_stream = true; # Check memory usage it'll grow in fetch_chunksize blocks (128k by default) if the backend doesn't send a Content-Length header, so only enable it for big objects
 set beresp.do_gzip = false; # Don't try to compress it for storage
 }

 # Sometimes, a 301 or 302 redirect formed via Apache's mod_rewrite can mess with the HTTP port that is being passed along.
 # This often happens with simple rewrite rules in a scenario where Varnish runs on :80 and Apache on :8080 on the same box.
 # A redirect can then often redirect the end-user to a URL on :8080, where it should be :80.
 # This may need finetuning on your setup.
 #
 # To prevent accidental replace, we only filter the 301/302 redirects for now.
 if (beresp.status == 301 || beresp.status == 302) {
 set beresp.http.Location = regsub(beresp.http.Location, ":[0-9]+", "");
 }

 # Responses with no TTL, a Set-Cookie header or "Vary: *" are marked uncacheable (hit-for-pass) for 2 minutes
 if (beresp.ttl <= 0s || beresp.http.Set-Cookie || beresp.http.Vary == "*") {
 set beresp.ttl = 120s; # Important, you shouldn't rely on this, SET YOUR HEADERS in the backend
 set beresp.uncacheable = true;
 return (deliver);
 }

 # Don't cache 50x responses
 if (beresp.status == 500 || beresp.status == 502 || beresp.status == 503 || beresp.status == 504) {
 return (abandon);
 }

 # Allow stale content, in case the backend goes down.
 # make Varnish keep all objects for 6 hours beyond their TTL
 set beresp.grace = 6h;

 return (deliver);
}

# The routine when we deliver the HTTP response to the user
# Last chance to modify headers that are sent to the client
sub vcl_deliver {
 # Called before a cached object is delivered to the client.

 if (obj.hits > 0) { # Add debug header to see if it's a HIT/MISS and the number of hits, disable when not needed
 set resp.http.X-Cache = "HIT";
 } else {
 set resp.http.X-Cache = "MISS";
 }

 # Please note that obj.hits behaviour changed in 4.0, now it counts per objecthead, not per object
 # and obj.hits may not be reset in some cases where bans are in use. See bug 1492 for details.
 # So take hits with a grain of salt
 set resp.http.X-Cache-Hits = obj.hits;

 # Remove some headers: PHP version
 unset resp.http.X-Powered-By;

 # Remove some headers: Apache version & OS
 unset resp.http.Server;
 unset resp.http.X-Drupal-Cache;
 unset resp.http.X-Varnish;
 unset resp.http.Via;
 unset resp.http.Link;
 unset resp.http.X-Generator;

 return (deliver);
}

sub vcl_purge {
 # Only handle actual PURGE HTTP methods, everything else is discarded
 if (req.method != "PURGE") {
 # restart request
 set req.http.X-Purge = "Yes";
 return(restart);
 }
}

sub vcl_synth {
 if (resp.status == 720) {
 # We use this special error status 720 to force redirects with 301 (permanent) redirects
 # To use this, call the following from anywhere in vcl_recv: return (synth(720, "http://host/new.html"));
 set resp.http.Location = resp.reason;
 set resp.status = 301;
 return (deliver);
 } elseif (resp.status == 721) {
 # And we use error status 721 to force redirects with a 302 (temporary) redirect
 # To use this, call the following from anywhere in vcl_recv: return (synth(721, "http://host/new.html"));
 set resp.http.Location = resp.reason;
 set resp.status = 302;
 return (deliver);
 }

 return (deliver);
}


sub vcl_fini {
 # Called when VCL is discarded only after all requests have exited the VCL.
 # Typically used to clean up VMODs.

 return (ok);
}
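
The five cookie-filtering steps in vcl_recv above are easier to follow when you can run them against a sample header. Here's a rough Python sketch that mirrors the same substitutions one for one. It assumes Python's re module behaves like Varnish's regex engine for these particular patterns, and filter_cookies is a made-up name for illustration, not anything Varnish provides.

```python
import re


def filter_cookies(cookie: str) -> str:
    """Mirror the five regsuball() steps from vcl_recv (hypothetical helper)."""
    # 1. Prepend a semicolon so every cookie starts with ";".
    cookie = ";" + cookie
    # 2. Remove the spaces that follow semicolons.
    cookie = re.sub(r"; +", ";", cookie)
    # 3. Put a space back in front of the cookies we want to keep.
    cookie = re.sub(r";(SESS[a-z0-9]+|SSESS[a-z0-9]+|NO_CACHE)=", r"; \1=", cookie)
    # 4. Drop every cookie that has no space after its semicolon.
    cookie = re.sub(r";[^ ][^;]*", "", cookie)
    # 5. Trim leftover semicolons and spaces from both ends.
    return re.sub(r"^[; ]+|[; ]+$", "", cookie)


print(filter_cookies("has_js=1; SESSabc123=xyz; _ga=GA1.2"))  # SESSabc123=xyz
print(repr(filter_cookies("has_js=1; _ga=GA1.2")))            # ''
```

An empty result corresponds to the branch where the VCL unsets the Cookie header and lets the request be cached. Anything left over is the case that ends in return (pass).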

#varnish

So I gave a presentation at Drupaldelphia a few weeks ago about the Paragraphs module.

The Paragraphs module is my favorite Drupal module that I've come across in probably the last 5 years. It's basically Drupal's implementation of the concept of “structured content” – one of those terms that sounds so abstract that you probably feel an unconscious repulsion to even learning more about the idea, but hopefully I can help get you over that.


The problem

The problem is the dreaded *body field*. The body field is (historically) basically the dumping ground for everything that is going into a piece of content on the website. For sites like this blog, made up of 99.8% text, it works fabulously well and I suspect that in the early days of blogging and the internet most content that went into some kind of CMS was modeled in this way. You're reading a body field right now. There were undoubtedly some images placed in with the text, but anything really fancy or custom was most likely coded by hand, outside of the CMS.

Things went this way for a number of years and as CMSs like Drupal and Wordpress continued to gain popularity and more and more people began to use them to run their websites, more and more “things” began to wander into the body field. I'd very much like to add some images to this post, for example, but it's actually kind of a PITA to do it in a reliable way.

One day some dudes invented a website where the whole world could post and share videos, then they let you embed those videos into other web pages. So now the body field has to accommodate text, images, and video embeds.

The slideshow was born. “Why can't I put a slideshow into my article?!” became a battle cry from legions of downtrodden District 12 editors. “Imgur lets me create slideshows!”

“Data journalism” comes along, and with it a thousand fancy infographics from your internal production teams and 3rd party tools alike, distributed via iframes and js snippets and holy shit letting our users embed javascript is suicide, right??

The Twitter card embed. I'll stop there.

Soundcloud. Every other media site with their own custom video player. Imgur. Flickr. Hubspot. Disqus.

Some (crappy) solutions

This is a problem for a number of reasons. The most immediate issue that this causes is that unless all your editors know how to write perfect HTML, you're going to be stuck with The Wysiwyg. Wysiwygs have come a pretty long way in the last couple years (a few of them anyway), but I don't know of any serious Wysiwyg solution out there that is able to keep pace with the number of new “things” showing up on the internet. Our editors want to put these things in their content in a way that will effectively keep them from breaking the site, and it's our job to give that to them somehow.

The most evolved solution to this problem is the one that Wordpress came up with, and Jeff Eaton espoused last year in his DrupalCon Talk “The Battle for the Body Field”, basically – shortcodes. This approach allows for a lot of editor creativity which should be a primary goal of our solution, but puts some guardrails up so that we're not constantly fielding tickets about a broken article.

So to recap, here are the most commonly employed solutions to this problem:

  • Don't let them put anything in there (Markdown)
  • Let them put everything in there (HTML)
  • Let them put almost anything in there, but try and keep them from blowing our leg off (shortcodes)

And yet

None of these addresses a fundamental thing that we should care about – reuse. Once you put something in the body field, it's essentially in the content roach motel, and it's never checking out. Your system can't have any awareness of what's inside that field, so unless someone manages to get to exactly that article where you used that image or that tweet, it's never going to be seen again.

There is another way though. Imagine being able to create a feed of images that were used in articles on your site that day. Imagine being able to grab all the twitter cards that were used in articles that were tagged to Cats. Or being able to easily add rich, multi-field captions to images without having to bend over backwards.

Structured Content

So if you take a step back and think about it, a piece of content on your website is often a fairly unstructured piece of work, but it can be broken down into a collection of pieces that are themselves very structured.

Take an image with caption. Trying to do this in the Wysiwyg frequently involves adding the caption to either the title or alt attribute and then using javascript to pull that out, build a DOM element out of it, and insert it somewhere in the vicinity of the image. What happens if you also need an attribution field in addition to the caption, though? That's the instant things start getting weird, and often we give the editor some unsatisfactory answer and they slink off to solve the issue in some unsatisfactory way.

But really, what if we treat that image/caption as its own entity? Then you have an entity with an image field and a caption field. If you want to add an attribution field, that's very easy in this model – you just add an attribution field. Or a URL. Or a date.

Something with a few more moving parts – how about that image gallery? Well, another entity for starters, but make it so you can add any number of images to the entity and presto. Since our system is aware of the kind of entity that you're using here, it's trivial to wrap it in the CSS classes needed to pull off an image gallery.

So essentially, rather than your content being something like Title/Summary/Body/Image for this/Image for that you end up with something more like Title/Summary/Collection of individual entities that make up the body of the article. Those individual entities are pretty easy to manage in themselves, since they're highly predictable. You just need some mechanism for relating them into the article that they live in and making sure they display in the right order. Once you do that though, you're not bound strictly by the article model anymore. You can use those entities in other ways as well.
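
If it helps to see that shape written down, here's a minimal sketch in Python. To be clear, this is not how Drupal or the Paragraphs module stores anything. Every class and function name here (TextBlock, ImageBlock, GalleryBlock, Article, images_tagged) is invented purely to illustrate "an article is an ordered collection of small structured entities", and why that makes reuse easy.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TextBlock:
    body: str


@dataclass
class ImageBlock:
    src: str
    caption: str = ""
    attribution: str = ""  # adding a field is just adding a field


@dataclass
class GalleryBlock:
    images: List[ImageBlock] = field(default_factory=list)


Block = Union[TextBlock, ImageBlock, GalleryBlock]


@dataclass
class Article:
    title: str
    summary: str
    tags: List[str] = field(default_factory=list)
    blocks: List[Block] = field(default_factory=list)  # list order is display order


def images_tagged(articles: List[Article], tag: str) -> List[ImageBlock]:
    """Collect every image used in articles that carry the given tag."""
    found: List[ImageBlock] = []
    for article in articles:
        if tag not in article.tags:
            continue
        for block in article.blocks:
            if isinstance(block, ImageBlock):
                found.append(block)
            elif isinstance(block, GalleryBlock):
                found.extend(block.images)
    return found


cat_post = Article(
    title="Cats of the internet",
    summary="A roundup",
    tags=["Cats"],
    blocks=[
        TextBlock("Some intro text."),
        ImageBlock("grumpy.jpg", caption="Grumpy", attribution="Someone's phone"),
        GalleryBlock([ImageBlock("a.jpg"), ImageBlock("b.jpg")]),
    ],
)
dog_post = Article(title="Dogs", summary="", tags=["Dogs"], blocks=[ImageBlock("rex.jpg")])

print([img.src for img in images_tagged([cat_post, dog_post], "Cats")])
# ['grumpy.jpg', 'a.jpg', 'b.jpg']
```

The point is the last function: because the system knows what each piece is, "all the images from articles tagged Cats" is a short loop. With one big body field there's nothing to loop over.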

This article got waaaay longer than I intended, so I'll get into Drupal's answer to this issue in the next one. As far as I know, this concept has existed in the CMS world for a very long time, but Drupal is the only platform that I know of that actually has an implementation of this concept in the Paragraphs module. Until then.

#workflow #drupal

See the previous chapter on installing Drupal.


Hi there, and congrats on making it this far. You should be looking at a screen that looks like this —

Drupal Welcome Screen

Congrats, you've just set up a website with arguably the most advanced CMS in the world!


Creating Content

This is what a Content Management System is for after all. If you followed the “standard install”, then you have a couple of different choices. In Drupal parlance, these are called “Content Types” and you have two of them so far – Basic Page and Article.

So what are content types?

I like to think of content types as the “things” on your website. These can be articles on a blog site, items in your catalog on an e-commerce site, pets for adoption on the local SPCA site, or really anything that you want to post on your site. In Drupal parlance we call them “Nodes”. Node was chosen for its deliberate vagueness, just go with it for now. Those different types of content that you want to put on your site – blog posts, pets – are appropriately known as “content types”. You can make as many of them as you want, they're just one way to categorize your content into the same kind of “thing”.

So let's create a new article. Feel free to explore the admin menu, which should be visible at this point or you can just head to http://localhost:8888/node/add. That's where we get started. I'm assuming no prior web development experience, but I am assuming that you've probably at least uploaded an image or two, and filled in some forms on the web before. That's all you have to do here.

Drupal 8's got some nice new options with some nice new polish for folks who are creating the content, but we're not going to get into that now. Let's get our hands dirty.

The Drupalnoobs conference

This conference will be for newcomers to Drupal and will feature lots of sessions about all aspects of getting started with Drupal. The sessions will be led by experienced and established Drupal developers who will present some pretty awesome material. The sessions will be recorded and put up on YouTube somewhere, so after the conference is done we'd like to put the videos up on the website. We'd like to group all the sessions into different tracks like “design” and “business” and “completely new to all of this”.

Naturally, we need a website to hold all of this stuff so that's what we're going to build and hopefully come out on the other side with a bit more grounding in how to get things done with Drupal.

#drupal

See the previous post in this series on getting started with Drupal


So welcome back, this is actually the most challenging part of this tutorial – installing Drupal. I'm assuming no prior web development experience, so the first part will be installing something to run your tutorial Drupal site on.

For this we'll be using a project called MAMP. MAMP is a package of software that makes it easy to set up Drupal (but not just Drupal) on your computer. I'll skip the deeper details for now, but head over to MAMP and download it. You can stick with the free version for now.

note: There are many basically equivalent packages out there for this purpose – XAMPP, Acquia Dev Desktop. If you'd prefer one of those feel free, but I'll be using MAMP because it's very simple and is what I got started with.

What is MAMP?

This section can be safely skipped if you don't care.

MAMP is basically the same package of software that runs on any webhosting server, think Dreamhost or GoDaddy. It stands for (M)y (A)pache, (M)ySQL, and (P)hp. These are the three things that you need to make Drupal work.

Apache is a “webserver”. The webserver piece is the one that sits there and listens for incoming requests from someone's browser (or bot). When a request comes in from somewhere, Apache figures out what that request is asking for and routes it to the correct item. It could be an image, which is very simple because it's just a file sitting on a disk. In this case, Apache grabs the file and returns it. This is called “the Request/Response” cycle, and is pretty much the slab that the internet was built on.

Sometimes however, the request is asking for a webpage in Drupal. This case is a bit more complex, and that's when PHP and MySQL come into play. PHP is a programming language. You might know that already, but Drupal is mostly written in PHP. That's why you need it for this tutorial. When you create a new page on your new site, the content of that page gets stored not as a file on a disk, but as a row in a database. MySQL is an exceptionally popular database, and the one that we'll use for this tutorial.

On to the show

Hopefully MAMP has finished downloading by this point. Go through the installer, and when you get done you should be able to open MAMP in the same way that you'd open any other application. Once it starts up, you should have a screen that looks basically like this —

Mamp opening screen

Once you click “Start Servers” you've done it! You've built your first webserver stack!

Click into “Preferences”.

Under the preferences option, you'll get some options to twiddle with. Don't twiddle with them, at least not yet. Option names will vary, but you're looking for the rightmost tab, it's either called “Apache” or “Webserver” in the most recent versions. Under that tab will be a most pertinent piece of information – the “Document Root”.

[!info] Document Root – where the webserver will look for the files that it's trying to serve.

In a nutshell, once we download Drupal, we're going to put all its files in that directory, so make a note of where that directory is!

Downloading Drupal!

Head on over to Drupal.org and download Drupal! At the time of this writing, that giant green button takes you to another screen where you are presented with a choice. I was hoping to shield my readers from this, but if you're going to learn Drupal I guess now's as good a time as any to explain why this choice exists at all.

You may skip all of this.


An interlude

Drupal has been around for nigh 12 years at this point. It was started in a Belgian kid's dorm room as more or less a message board for that dorm. Early in life it embraced the open source model for development, which means that other kids in his dorm were able to hack on it and add to it and improve on it and make it better for everybody.

Many years later Drupal was running some of the largest websites on the internet, and while it had been added on to and improved by thousands of developers by that point you could still find some of that 12 year old dorm code if you looked in the right places. Many people, your author included, felt disbelief that such code could be responsible for so much, yet at the same time took great comfort and pride that really anyone could learn this stuff just by following this code around. There truly was nothing really fancy about Drupal's codebase for a great many years. A few really smart patterns up front, followed diligently for years, and the rest is early internet history.

But time marches on, and with it evolution. Standards in computer engineering, common patterns for solving common problems, and much more complex needs on the web necessitated engaging with the wider PHP ecosystem. After all, Easter Island was once a thriving community, yet in time it thrived itself right out of existence. Drupal wanted to avoid such a fate, so a decision was made in 2011 to replace some key pieces of Drupal's internal code with more modern code from a well known PHP framework – Symfony.

This made a heck of a lot of sense. Much of Drupal's aforementioned dorm code had very interesting, almost paleological qualities about the way that it solved problems, as if “this was how our ancestors built a fire before we had matches”, and newcomers to Drupal that *did* have a background in software development were often left scratching their heads at some of the decisions. In a nutshell, learning Drupal was easier for newcomers to web development than it was for established developers.

Thus the rather controversial decision was made to standardize some of the very deepest parts of Drupal – those dealing with the “request/response” cycle.

Thus began a process that took 5 years and involved an almost complete rewrite of Drupal. This is both shocking and obvious in hindsight, since a complete rewrite is something you never, ever, ever want to do with a software project, yet once you modernize a piece of a system, the rest of the system looks that much more archaic.

The good part – Drupal is a modern and really impressive piece of software engineering, and includes many more features in the standard install that you're going to want on your site than previous versions. It's much more “batteries included” than older versions that required you to download and install lots of add ons to get it to do the things you really wanted it to do.

The bad part – much of the code that has been written by folks like you and me over the last decade doesn't work anymore. This is kinda brutal, but such is evolution. It also opens up something of a goldmine for new development opportunities within the Drupal ecosystem, but the flip side is that learning to code for Drupal 8 will be a much different experience if you are new to building software. It'll require you to know what you're doing, which I most certainly didn't when I was learning Drupal (6).

The other good part – this entire tutorial can be done now with Drupal 8.

So go ahead and download Drupal 8, but once you decide that Drupal is, in fact, for you, you'll probably revisit this topic.

Back to Drupal

So you've downloaded Drupal 8 – unzip it. You'll have a bunch of files and folders that look like this inside the newly unzipped directory -

Downloads/drupal-8.0.5 [ tree -L 1 ] 4:50 PM
.
├── LICENSE.txt
├── README.txt
├── autoload.php
├── composer.json
├── composer.lock
├── core
├── example.gitignore
├── index.php
├── modules
├── profiles
├── robots.txt
├── sites
├── themes
├── update.php
├── vendor
└── web.config

6 directories, 10 files

All those files go in “The Docroot” – which is the path that you noted earlier in your MAMP preferences under Apache/Webserver/whatever. It'll end in htdocs, so something like /Applications/MAMP/htdocs if you're on a Mac, or whatever that screen says if you're not.

The big payoff

Something always goes funny with people's computers, but at this point you should be able to navigate your browser to localhost:8888 and be greeted with the Drupal installation screen.

[Image: Drupal 8 install screen]

We're going to be choosing all the defaults for this tutorial. Click through the language screen, and the next option is for “installation profile”; just choose Standard.

The next screen – “System Requirements” – is the tricky one. Ask below in the comments and we'll try to debug it together if you aren't allowed through. MAMP should have all this sorted out for you already, though, so soldier on.

The next and basically final step is to give Drupal the connection credentials to your MySQL database. Those can be found on the welcome webpage if you click that middle button in MAMP. That'll take you to a screen that tells you for sure, but it should be something like

user: root
pass: root
host: localhost
(open up the advanced options)
port: 8889
(leave the table prefix empty)

At this point, you're in. You've installed Drupal. There is one more configuration screen that you can plug all the answers into on your own.

Save and continue on to the fun part of the tutorial!

#drupal

I was asked recently how to reverse a string in PHP, and I hadn't done any low-level string manipulation with PHP in a little while. I couldn't remember the signature of substr(), but that wasn't my method anyway. Mine was more like iterating over an index, working from the back forward and concat-ing each character on to a new string. Also, this is like 5 minutes' worth of code, so cut me some slack.


<?php

echo "\nString reverse test\n";

function bench_it($func) {
  if (function_exists($func)) {
    $str = file_get_contents(__DIR__ . '/mobydick.txt');
    $results = [];
    $iter = 10;
    for ($i = 0; $i < $iter; $i++) {
      $then = microtime(true);
      $func($str);
      $now = microtime(true);
      $results[] = $now - $then;
    }
    // Mean wall-clock time per call, in seconds.
    $avg = array_sum($results) / count($results);

    echo "$func avg: $avg\n";
  }
}

function substr_test($str) {
  $len = strlen($str);
  $new_str = "";
  while ($len > 0) {
    // .= concatenates; += would attempt numeric addition.
    $new_str .= substr($str, $len - 1, 1);
    $len--;
  }
}

function str_concat($str) {
  $len = strlen($str);
  $new_str = "";
  while ($len > 0) {
    // Index the string directly and append with the concatenation operator.
    $new_str .= $str[$len - 1];
    $len--;
  }
}

function str_push_and_join($str) {
  $len = strlen($str);
  $arr = [];
  while ($len > 0) {
    $arr[] = $str[$len - 1];
    $len--;
  }
  implode('', $arr);
}

function strrev_test($str) {
  strrev($str);
}

bench_it('substr_test');
bench_it('str_concat');
bench_it('str_push_and_join');
bench_it('strrev_test');

grubb:php/ $ php test.php [18:18:13]

String reverse test

substr_test avg: 1.2850220918655
str_concat avg: 0.24263381958008
str_push_and_join avg: 0.63150105476379
strrev_test avg: 0.00075311660766602

So the method that I was working on (#2 str_concat) is the fastest besides the built in strrev(), but most interesting is when you run these same tests on PHP 7.0.4 —

grubb:php/ $ brew unlink php56 && brew link php70 [18:18:56]
  Unlinking /usr/local/Cellar/php56/5.6.19... 19 symlinks removed
  Linking /usr/local/Cellar/php70/7.0.4... 17 symlinks created
grubb:php/ $ php test.php [18:19:20]

String reverse test

substr_test avg: 0.12836751937866
str_concat avg: 0.084317827224731
str_push_and_join avg: 0.13263397216797
strrev_test avg: 0.00083370208740234
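To put a number on the difference between the two runs, here's a quick sketch that divides the PHP 5.6.19 averages by the PHP 7.0.4 ones. It's in Python purely for the arithmetic, the timings are copied straight from the output above, and the variable names are just mine for illustration.

```python
# Average seconds per run, copied from the two test runs above.
php56 = {
    "substr_test": 1.2850220918655,
    "str_concat": 0.24263381958008,
    "str_push_and_join": 0.63150105476379,
    "strrev_test": 0.00075311660766602,
}
php70 = {
    "substr_test": 0.12836751937866,
    "str_concat": 0.084317827224731,
    "str_push_and_join": 0.13263397216797,
    "strrev_test": 0.00083370208740234,
}

# How many times faster each function ran on PHP 7.0.4 than on PHP 5.6.19.
speedup = {name: php56[name] / php70[name] for name in php56}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.1f}x")
```

On these numbers the three hand-rolled loops come out roughly 10x, 2.9x and 4.8x faster on PHP 7, while the built-in strrev() is essentially unchanged. This is one machine and ten iterations, so treat it as a rough indication rather than a proper benchmark.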

#php

It's an interesting time to be building things for the web. It's matured to a place where much of our day to day lives is conducted through sites that we access online. I personally buy books on paper, diapers, any piece to repair a broken appliance. I search for clues on how to do my job. I read up on the latest state of the technology scene. Every one of these places I go has largely the same stuff on it – some kind of login, some kind of way to get to “my stuff”, some kind of way to make sure that others can't get to my stuff.

Building this stuff is not exactly easy, and to do it in a way that makes it a lot harder for other people to break (or break into) is downright terrifying. So what do we do? We team up. How do we do it? Well, in this day and age it's called “open source software”.

Drupal is just such a piece of software. Right out of the box it comes with all that standard stuff – user accounts, a mechanism for posting stuff online – either publicly or privately, and mechanisms for doing many of the other things that you could think of. Lots of folks use Drupal and have been using Drupal for many years now, which means that for the vast majority of things that you want your site to do someone else has wanted their site to do that too. Even better, they've already solved the problem and given the solution back to us all, so you don't have to go solve it again. This is the essence of open source – folks all over the world working together, and a huge part of why I use Drupal.

What about Tool X?

To be sure, Tool X is excellent. It has a great user community and many of these same problems have been solved in a different way inside it, but it actually requires you to write code, which Drupal doesn't require for you to get started.

Tool W is another project much like Drupal, but it isn't nearly as flexible as Drupal; it stays within a smaller set of lines. It does what it does within those lines excellently, so definitely evaluate if you can solve your case with Tool W before coming to Drupal, as Drupal is a bigger piece of machinery with a steeper learning curve.

Who uses Drupal?

Lots of folks. Right now the best use case for Drupal is for sites that have a lot of content. These include publishers and their sites, government agencies and their sites, and educational institutions and their sites. Lots of times these folks have specific needs for how they present their content to the world and how they allow the world to access it, and Drupal fits that use case like a glove.

Turns out that most things that go on the web aren't that far away from this specific use case, so extending Drupal to get there is often a pretty straightforward and well trodden path. This series hopes to make it even more straightforward for newcomers.

Why not Drupal?

Drupal is not good at everything, however. Don't use it to ingest real time trade data from the NYSE, for example. You'll want a leaner and more purpose built system for that, but chances are if you're building something like that you already know this.

Now, on to installing Drupal.

#drupal

Hi there, I'm new to Django. I love the contributed ecosystem, but all of the options that I found there for dealing with Markdown were just too heavy. I didn't need a Wysiwyg editor, I just wanted an output filter. As it turns out this is exceptionally easy to do!


Python has a really amazing lib situation, so I just found the smallest Python Markdown lib that I could. It's called “mistune”. Do a pip install mistune.

So within your app, let's call it “blog”, create a directory called templatetags. By the way, this is all pretty easy to parse out of their killer documentation. Create a file in there called markdownify.py.

# blog/templatetags/markdownify.py
from django import template
import mistune

register = template.Library()

@register.filter
def markdown(value):
    # Render the Markdown source in `value` to HTML.
    markdown = mistune.Markdown()
    return markdown(value)

It is as simple as that. In whatever template you'll actually want to be rendering markdown, you'll need to include this templatetag with

 {% load markdownify %}

at the top of the template. Then you'll just pipe the output that you want to render like you do in every other template lib:

{{ post.body | markdown | safe }}

The full example of the template that renders this page is here.


But wait, there's more!

How about syntax highlighting? We're programmers after all, and Python just happens to have the great-granddaddy of all syntax highlighting libs in Pygments. I've known of Pygments for years, since it used to be a requirement of one of the Ruby libs for Markdown rendering (if you wanted syntax highlighting). In other words, even Ruby leaned on Pygments for a great number of years.

So pip install pygments. Then scroll down the page on the Mistune docs and follow along. You'll be adding some code to the markdownify.py file.

from django import template
import mistune
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import HtmlFormatter

register = template.Library()

class HighlightRenderer(mistune.Renderer):
    def block_code(self, code, lang):
        if not lang:
            # No language given: emit a plain, escaped code block.
            return '\n<pre><code>%s</code></pre>\n' % mistune.escape(code)
        lexer = get_lexer_by_name(lang, stripall=True)
        formatter = HtmlFormatter()
        return highlight(code, lexer, formatter)

@register.filter
def markdown(value):
    renderer = HighlightRenderer()
    markdown = mistune.Markdown(renderer=renderer)
    return markdown(value)

That HighlightRenderer class is directly out of the Mistune docs, so thank you Mistune Author! That is seriously all it takes, but you'll need a stylesheet, of which there are plenty. I searched for “pygments stylesheets” and came across this project, so you'll need to pick one of those themes and get it into your project somewhere. By default, the zenburn theme is expecting the wrapper div to have a CSS class of 'codehilite' instead of what it needs – 'highlight', so a quick search and replace and I had syntax highlighting in less than 5 minutes.
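If you'd rather not hunt down a stylesheet at all, Pygments can write one for you. A minimal sketch, assuming you're happy with one of the styles that ships with Pygments (I've used "monokai" here purely as an example) and that your wrapper div has the default 'highlight' class:

```python
from pygments.formatters import HtmlFormatter

# "monokai" is just an example; any style name that Pygments ships will do.
formatter = HtmlFormatter(style="monokai")

# Scope every rule under ".highlight", the class on the wrapper div that
# HtmlFormatter emits by default.
css = formatter.get_style_defs(".highlight")

print(css)
```

Redirect that output into a .css file in your static directory and link it from your base template. Since the selector is passed in explicitly, there's no search and replace on the class name.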


*edit Sept 2016*

So once you manage your way through all this, you'll be able to use “fenced code blocks” in your posts. They look like this —

```php
<?php 

function foo() {
 /// ...
}
```

becomes

<?php 

function foo() {
 /// ...
}

You can use either a trio of tildes ~ or backticks ` to open and close one of those code blocks, and I typically just pass the file extension and it generally works. You can also write out the full name of the language.

```py
def method():
    return "foo"
```

becomes

def method():
    return "foo"

Just be advised that it is possible to fatally hose your website if you happen to pass a language for which Pygments doesn't have a “lexer”, meaning that it has no idea how to highlight the syntax of that language. That happened to me with some Varnish config files that I tried to highlight with a .vcl extension on them. I don't remember how I fixed it but I'm pretty sure it required going directly to the database to change the post since my site was toast. You are warned.
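In hindsight, one way to avoid that failure is to catch the exception Pygments raises when it can't find a lexer and fall back to plain text. This is a sketch and not what I actually did at the time; highlight_safely is a made-up helper name, and the same try/except could go inside block_code in the renderer above.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import TextLexer, get_lexer_by_name
from pygments.util import ClassNotFound


def highlight_safely(code, lang):
    """Highlight `code` as `lang`, falling back to plain text for unknown names."""
    try:
        lexer = get_lexer_by_name(lang, stripall=True)
    except ClassNotFound:
        # Pygments has no lexer registered under this name, so render the
        # code unhighlighted instead of letting the exception escape.
        lexer = TextLexer()
    return highlight(code, lexer, HtmlFormatter())


# An unknown language name no longer takes the page down with it.
print(highlight_safely("backend default {}", "definitely-not-a-language"))
```

The post still renders, just without colors, which beats editing rows in the database by hand.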

#python #django

I started working with Django last week. The documentation is complete, organized, and located in one indexed portion of the website. You can download a PDF of the entire thing and it's better than any O'Reilly book you could possibly buy about Django. If you land on a page for an old version of the framework, it lets you know.

The same thing goes for Postgres.

The same thing goes for Symfony.

The same thing goes for Rails.

The same thing goes for React.

These are tools that want to be used. It's obvious from the onboarding tutorials in each of these that they want to make the process easy for noobs.

Contrast this with Drupal. I had been poking at and trying to figure out Drupal for almost a year (getting actual work done with Wordpress in the meantime) before I picked up a book that finally cleared it up for me. Oh! Drupal isn't supposed to do anything! You have to go module shopping to make it do simple things! And you have to go buy a book to tell you that!

And the situation has only gotten worse now that Acquia has decided to throw away over a decade of community knowledge about how to build Drupal sites. Where's the simple onboarding tutorial in here (?), because I can't find it.


I'm not saying that Drupal 8 is going to fail – god knows it is a ginormous step forward in SO many ways – but if it does it'll be because the Drupal project takes building things far more seriously than it does anything else, especially teaching others how to use those things. The smartest thing that Acquia could do at this point for the future of Drupal would be to put a complete moratorium on any new features until the currently existing features are covered with this level of official documentation.

#drupal

So before the pitchforks come out, this is where this blog post came from.

I need someone to submit “Why is drupal uncool?” to https://t.co/WkZUojaBkS #DrupalCon

— Cathy YesCT (@YesCT) February 22, 2016

YesCT started riffing on great ideas for DrupalCon sessions. Alas, I didn't submit any of them; I submitted my session about how the Paragraphs module got my mojo back. It was not accepted, which is kind of a relief.

Anyway, I have a few ideas on this topic. Let's see what falls out.


It's PHP

So yes, PHP. AFAIK in all of its 20+ year history PHP has never ever been the Hot New Thing. It has things to recommend it, namely the deployment story which enables folks to get started building things immediately and not have to worry about setting up anything on a server. If there were any other option that were this simple, PHP probably would not be the thing that it is today, but for that horse a kingdom has been built.

Just Google “why PHP sucks” for more on this topic. In a nutshell it's maddeningly inconsistent. I've been working with Python for all of 4 weeks and I already know it better than I know PHP after 8 years of working with it. “The API is consistent”, which is a fancy way of saying that you don't have to look up the order of the arguments for a function every single time you want to use that function. Quick, in_array() – is it haystack/needle or needle/haystack?

Drupal suffers from this spillover, as do basically all PHP frameworks. Until D8, Drupal didn't even have the benefit of object orientation (by benefit I mean, it makes it cooler).

It (historically) tries way too hard

Remember this – http://certifiedtorock.com/? Sadly, this is still a thing on the internet.

Or what about this one?

[Image: webchick world tour]

I'm super, duper sorry for dredging this up, because I have nothing but respect for Webchick. Seems like everything she writes is so dead on and the respect that she has in the community probably surpasses what even Dries gets. But I'm not sure who thought this was a good idea.

In this regard, Drupal is like me at 15 or so. I was not cool, but I went and bought Doc Martens anyway because the cool kids wore 'em. I think both of these are around the same era – the time that Rails was absolutely the White Hottest New Thing and everyone else was trying to keep up.

The documentation sucks

And this is the real point here. I started working with Django last week. The documentation is complete, organized, and located in one indexed portion of the website. You can download a PDF of the entire thing and it's better than any O'Reilly book you could possibly buy about Django. If you land on a page for an old version of the framework, it lets you know.

The same thing goes for Postgres.

The same thing goes for Symfony.

The same thing goes for Rails.

The same thing goes for React.

These are open source projects that want to be used. They put forth as much effort into the documentation for the things they've built as they put into the things themselves. It's impossible as a Drupal dev to see documentation like this and not feel like “wow, this is a toolkit that takes itself seriously and respects folks who want to use it”. It's impossible as a Drupal dev to see documentation like this and not feel like “WTF is Acquia doing?”

Drupal 8 will not succeed until it has documentation like this. I have contributed to the Drupal 8 User Guide, but good luck finding it. You can find the project page, but the actual documentation is nowhere on the front page results of that search.

[Rather large rant about the JS framework in core discussion redacted.]

#drupal

So here it is. The last version of this blog – a Rails frontend to a Postgres backend – actually stood for almost 2 and a half years. I think that's probably a record.

In keeping with my decided new theme for this blog however, I've decided to rewrite the thing in Django. Not that you can't google it yourself, but Django is (at a high level) basically the Python version of Rails. Actually, it's basically the Python version of every MVC web framework. It's been around for 10 years, so it is far from the hot-new-thing. I've finally been doing this for long enough that I shy away from the hot-new-thing and actively seek out boring, tested solutions to problems.

At work we've begun a small project that we were targeting to build on Drupal 8. Faced with the timeframe, the relative lack of basic modules for building Drupal 8 sites, and the learning curve for the code that we'd inevitably have to write on our own, I pitched the idea to my team to try something completely different. I prefaced it with “this is a terrible idea, so raise your hand at any point”, but surprisingly they were all amenable. We all spent a day going through the amazing tutorial and the amazing documentation and they were still on board. So I decided to rebuild this blog to take the training wheels off and give us all some reference code for some of the simple features that weren't walked through in the tutorial – taxonomy, sitemaps, extending templates, etc.

Amazingly it took me all of 4 hours to rebuild the whole thing and migrate the data from one PG schema into the one that Django wants to use. Django is even easier to use than Rails – a fact that blew my mind once I started playing with it.

The deployment story however, is a shit show. I spent as many days trying to get this thing up on a Digital Ocean server as I spent hours building the application in the first place. I'm hoping to find that there is an easier, more modern means for serving Python apps in 2016 after some more digging.

Anyway, thanks for stopping by!

#generaldevelopment #python #django