Writing The Mapper/Router Thing - DWEB 002

Picking up from where we left off, the first thing we need to do is write the mapper/router thing (I don’t know what it is actually called, but I’m calling it the mapper). The main function of this piece of code is to generate the mapping table for the site source files, and to determine which files will be updated, generated, left alone or removed.

Before writing the function though, we need to add some of my boilerplate:

#!/bin/busybox sh

[ $_ != $0 ] && return 1
set -e
set -u

SCRIPT_VERSION='0.0.1'

SCRIPT_NAME=$( basename ${0} )
BASE_DIR=$( dirname $( realpath ${0} ) )

Here are what these lines do:

  1. #!/bin/busybox sh is used to set the shell used in the script.
  2. [ $_ != $0 ] && return 1 is used to check whether the script is being sourced or run in a subshell. We don’t want it to be sourced because $0 would then refer to the shell or script that sourced it rather than to this file. If the script is sourced it will stop and return an error. We shouldn’t use exit because we don’t want it to stop the execution of the interactive shell or the script that sourced it.
  3. set -e causes the script to exit if a command fails, set -u causes the script to exit when an unset variable is expanded.
  4. SCRIPT_VERSION='0.0.1' is set just in case 😅
  5. SCRIPT_NAME=$( basename ${0} ) is the script name and BASE_DIR=$( dirname $( realpath ${0} ) ) is the absolute path of the parent directory of the script. Both of these variables would point to the parent shell or script if this script were sourced.
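
To see what set -e and set -u actually do without risking the current shell, here is a minimal sketch that runs each experiment in a child sh process. The variable names (UNSET_DEMO_VAR, NOUNSET_STATUS and so on) are made up for the demo and are not part of DWEB.

```shell
# Each experiment runs in a child shell so a failure cannot kill the caller.

# set -u: expanding an unset variable aborts the child with a non-zero status.
sh -c 'set -u; echo "${UNSET_DEMO_VAR}"; echo "never printed"' 2>/dev/null \
	&& NOUNSET_STATUS=0 || NOUNSET_STATUS=$?

# set -e: the first failing command aborts the child.
sh -c 'set -e; false; echo "never printed"' \
	&& ERREXIT_STATUS=0 || ERREXIT_STATUS=$?

# ${VAR:-default} is the usual way to expand a possibly unset variable under set -u.
DEFAULT_VALUE=$( sh -c 'set -u; echo "${UNSET_DEMO_VAR:-fallback}"' )

printf 'set -u status: %s\n' "${NOUNSET_STATUS}"
printf 'set -e status: %s\n' "${ERREXIT_STATUS}"
printf 'default value: %s\n' "${DEFAULT_VALUE}"
```

The exact non-zero status reported for set -u differs between shells, so the sketch only relies on it being non-zero.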

Next we set some other useful variables, which serve as DWEB default configurations:

SOURCE_DIR="${BASE_DIR%/}/content"
PUBLIC_DIR="${BASE_DIR%/}/public"
TEMPLATES_DIR="${BASE_DIR%/}/templates"
FUNCTIONS_DIR="${BASE_DIR%/}/functions"
CONTENT_EXTENSION='txt'

These are self-explanatory, but to be sure: SOURCE_DIR is where the content source files live, PUBLIC_DIR is where the generated HTML files will go, templates and partials go into TEMPLATES_DIR, FUNCTIONS_DIR is where helper scripts live, and CONTENT_EXTENSION is the extension of the content source files that will be converted into HTML files.
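
The ${VAR%/} and ${VAR#.} expansions used throughout the script are worth a quick look. The sketch below uses made-up paths and DEMO_* names purely for illustration; it shows that stripping an optional trailing slash (or leading dot) lets both spellings of a value produce the same result.

```shell
# Hypothetical values, only for illustration.
DEMO_BASE_A='/srv/dweb/'
DEMO_BASE_B='/srv/dweb'

# ${VAR%/} removes one trailing slash if there is one, so both spellings join cleanly.
DEMO_DIR_A="${DEMO_BASE_A%/}/content"
DEMO_DIR_B="${DEMO_BASE_B%/}/content"

# ${VAR#.} removes one leading dot if there is one, so 'txt' and '.txt' both work.
DEMO_EXT_A='.txt'
DEMO_EXT_B='txt'
DEMO_INDEX_A="index.${DEMO_EXT_A#.}"
DEMO_INDEX_B="index.${DEMO_EXT_B#.}"

printf '%s\n%s\n%s\n%s\n' \
	"${DEMO_DIR_A}" "${DEMO_DIR_B}" "${DEMO_INDEX_A}" "${DEMO_INDEX_B}"
```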

After that come a couple of helper functions with some sanity checks:

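_exitmsg() {
	# ${1:-0} is used instead of ${1:=0}: positional parameters cannot be
	# assigned with :=, so a call without arguments would fail.
	# 2>/dev/null hides the complaint from [ when ${1} is a message, not a number.
	if [ "${1:-0}" -gt 0 ] 2>/dev/null && [ "${1:-0}" -le 255 ]; then
		EXIT_STATUS=${1}
		MSG_DIR=2
		shift
	else
		EXIT_STATUS=0
		MSG_DIR=1
		# Only drop the first argument when it is an explicit status of 0.
		[ "${1:-}" = 0 ] && shift
	fi
	printf '%s: %s\n' "${SCRIPT_NAME}" "${*:-exiting}" >&${MSG_DIR}
	exit ${EXIT_STATUS}
}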

The _exitmsg function writes a message to stdout or stderr depending on the first argument, then exits the script. And yes, I use tabs in shell scripts.
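
Since _exitmsg calls exit, the easiest way to poke at it is from a subshell. The sketch below is only a demo: it carries its own copy of the function (written with ${1:-0} so a call without arguments also works, which is an assumption on top of the listing above), assumes the script is called dweb.sh, and captures the output and exit status of two calls.

```shell
SCRIPT_NAME='dweb.sh'	# assumed script name, for the demo only

# Demo copy of _exitmsg; ${1:-0} lets it run even when no argument is given.
_exitmsg() {
	if [ "${1:-0}" -gt 0 ] 2>/dev/null && [ "${1:-0}" -le 255 ]; then
		EXIT_STATUS=${1}
		MSG_DIR=2
		shift
	else
		EXIT_STATUS=0
		MSG_DIR=1
		[ "${1:-}" = 0 ] && shift
	fi
	printf '%s: %s\n' "${SCRIPT_NAME}" "${*:-exiting}" >&${MSG_DIR}
	exit ${EXIT_STATUS}
}

# Command substitution runs the call in a subshell, so exit stays contained.
# Status 0: the message goes to stdout.
OK_OUT=$( _exitmsg 0 'all done' ) && OK_STATUS=0 || OK_STATUS=$?

# Status 3: the message goes to stderr, so stderr is captured and stdout dropped.
ERR_OUT=$( _exitmsg 3 'something broke' 2>&1 >/dev/null ) && ERR_STATUS=0 || ERR_STATUS=$?

printf '%s (status %s)\n' "${OK_OUT}" "${OK_STATUS}"
printf '%s (status %s)\n' "${ERR_OUT}" "${ERR_STATUS}"
```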

_chkreqs() {
	REQUIRED_DIRS=${REQUIRED_DIRS:-}:${SOURCE_DIR}
# 	REQUIRED_DIRS=${REQUIRED_DIRS:-}:${TEMPLATES_DIR}
# 	REQUIRED_DIRS=${REQUIRED_DIRS:-}:${FUNCTIONS_DIR}

	_OLD_IFS=${IFS}
	IFS=':'
	for REQUIRED_DIR in ${REQUIRED_DIRS}; do
		if [ ! -d ${REQUIRED_DIR} ]; then
			FATAL_ERR=1
			printf 'Required directory %s is missing\n' ${REQUIRED_DIR} >&2
		fi
	done
	IFS=${_OLD_IFS}
	unset _OLD_IFS

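	# A bare [ string ] is true for any non-empty string, including 0 and 1,
	# so the value has to be compared numerically.
	[ "${FATAL_ERR:-0}" -eq 0 ] || _exitmsg 1 'Fatal error, exiting!'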
}

_chkreqs so far checks for the required directories. I know we will need to check for the templates and functions directories later on, so I commented out the lines adding them to the REQUIRED_DIRS variable.
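
The trick _chkreqs relies on is splitting a colon-separated list by temporarily setting IFS. Here is a minimal sketch of that idea on its own; demo_missing_dirs and the DEMO_* variables are hypothetical names, and the directories are created in a throwaway mktemp -d location.

```shell
# Hypothetical helper: print every directory from a colon-separated list
# that does not exist.
demo_missing_dirs() {
	_DEMO_OLD_IFS=${IFS}
	IFS=':'
	# Unquoted on purpose: the expansion is split on ':' into one word per directory.
	set -- ${1}
	IFS=${_DEMO_OLD_IFS}
	unset _DEMO_OLD_IFS
	for DEMO_DIR in "$@"; do
		# A leading ':' in the list produces an empty first field; skip it.
		[ -n "${DEMO_DIR}" ] || continue
		[ -d "${DEMO_DIR}" ] || printf '%s\n' "${DEMO_DIR}"
	done
}

DEMO_TMP=$( mktemp -d )
mkdir "${DEMO_TMP}/content"

# Same shape as REQUIRED_DIRS: a leading ':' followed by the directories.
DEMO_MISSING=$( demo_missing_dirs ":${DEMO_TMP}/content:${DEMO_TMP}/templates" )
rm -r "${DEMO_TMP}"

printf 'missing: %s\n' "${DEMO_MISSING}"
```

Only the templates directory is reported, since content was created just before the call.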

Next up, this post’s main event, the _mapper function. It’s a bit long so I’ll break it down into digestible bites:

_mapper() {
	[ -d "${SOURCE_DIR}" ] || mkdir "${SOURCE_DIR}"
	[ -d "${PUBLIC_DIR}" ] || mkdir "${PUBLIC_DIR}"
	# ${TEMP_DIR:-} avoids a set -u failure when TEMP_DIR has not been set yet.
	[ -d "${TEMP_DIR:-}" ] || TEMP_DIR=$( mktemp -d )
	FILES_MAP="${TEMP_DIR%/}/map"
	: > ${FILES_MAP}

In the first few lines we do some sanity checks, creating the necessary directories. I know I’ve just written a function that will yell and scream when one of the required directories is missing. Well, here we’re only doing sanity checks for this function, making sure that nothing outside its scope will break it; checking requirements and spitting out errors is not this function’s job. Also, the last line, : > ${FILES_MAP}, creates the temporary file where we will store our map. I like this particular line because it utilizes the command : and because : > looks like this emoji ☺️.
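
For anyone who has not met : > before, here is a tiny sketch of what it does to a file that already has content. DEMO_MAP and DEMO_SIZE are throwaway names for the demo.

```shell
DEMO_MAP=$( mktemp )
printf 'stale line\n' > "${DEMO_MAP}"

# ':' is the null command: it prints nothing, so the redirection alone
# creates the file if needed and truncates it to zero bytes.
: > "${DEMO_MAP}"

DEMO_SIZE=$( wc -c < "${DEMO_MAP}" | tr -d ' ' )
rm "${DEMO_MAP}"

printf 'size after truncation: %s bytes\n' "${DEMO_SIZE}"
```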

	# CONTENT_EXTENSION is the variable defined with the defaults above;
	# SOURCE_EXT was never set and would abort the script under set -u.
	for SOURCE_FILE in $( find "${SOURCE_DIR%/}/" ); do
		[ -d ${SOURCE_FILE} ] \
			&& [ -f "${SOURCE_FILE%/}/index.${CONTENT_EXTENSION#.}" ] \
			|| [ -f "${SOURCE_FILE%/}/index.html" ] \
			&& continue

		[ -f ${SOURCE_FILE} ] \
			&& [ -f "${SOURCE_FILE%.${CONTENT_EXTENSION#.}}.html" ] \
			&& continue

		SOURCE_SUBDIR="$( dirname ${SOURCE_FILE} )/"
		[ -d ${SOURCE_FILE} ] && SOURCE_SUBDIR=${SOURCE_FILE}

		OUTPUT_DIR="${PUBLIC_DIR%/}/${SOURCE_SUBDIR#${SOURCE_DIR%/}/}"

		FILENAME=$( basename ${SOURCE_FILE} ".${CONTENT_EXTENSION#.}")

		OUTPUT_FILENAME="${FILENAME}.html"
		PARAMETER='leaf'

		[ "${FILENAME}" == 'index' ] \
			&& PARAMETER='branch' \
			&& [ ${OUTPUT_DIR%/} == ${PUBLIC_DIR%/} ] \
			&& PARAMETER='root'

		[ "${FILENAME}" == "$( basename ${SOURCE_FILE} )" ] \
			&& OUTPUT_FILENAME=${FILENAME} \
			&& PARAMETER='static'

		[ -d ${SOURCE_FILE} ] \
			&& OUTPUT_FILENAME='index.html' \
			&& PARAMETER='branch' \
			&& [ ${OUTPUT_DIR%/} == ${PUBLIC_DIR%/} ] \
			&& PARAMETER='root'

		OUTPUT_FILE="${OUTPUT_DIR%/}/${OUTPUT_FILENAME}"

		OPERATION='create'
		[ ${SOURCE_FILE} -ot ${OUTPUT_FILE} ] && OPERATION='keep'
		[ ${SOURCE_FILE} -nt ${OUTPUT_FILE} ] && OPERATION='update'

		printf '%s:%s:%s:%s\n' \
			${OUTPUT_FILE} \
			${OPERATION} \
			${PARAMETER} \
			${SOURCE_FILE} \
			>> ${FILES_MAP}
	done

Next, the big for loop. It loops over everything in the source directory and performs some checks:

  1. If the loop variable is a directory, then we know that there should be a corresponding <directory name>/index.html file in the public directory. So we check whether the directory contains a content file named index or an index.html file. If we can’t find either, it means that we’ll have to create an index file for this directory from a listing template. If we do find an index file, we skip to the next iteration because we know we’ll encounter that file and process it.
  2. We make another check, stay with me on this one: if the loop variable is a file, and that file is a content file, awesome_page.txt for example, and there is a file named awesome_page.html in the same directory, then we skip to the next iteration because we will encounter that file and process it. This couple of checks sets the priority for generating index.html files: index.html > index.txt > directory listing, and for content files in general: content.html > content.txt.
  3. From this point onward, we will assume that we’re dealing with a content file. We will set the variables’ default values accordingly and change them if we need to. Here we set the output subdirectory variable; for files it’s a mirror of the parent directory, and if a directory passed the first two checks, the output subdirectory mirrors the directory itself.
  4. Since we’re assuming that the current file we’re processing is a content file, we strip the extension from the filename and append .html, which gives us the output filename. Also the template we’re going to use is the leaf template, we’ll get to that in the future.
  5. We check if the filename without .txt, our content extension, is index. In this case we’re gonna use the branch template, unless of course this index content file is in the root of the source directory, in which case we will be using the root template.
  6. But if the filename stays the same after we strip the content extension, it means that it is not actually a content file. Then we need to set the output filename to be the same as the input filename, without appending .html, and change the template parameter to static, since the source file will be copied as is.
  7. We check again if the loop variable is a directory. By this point, we are sure that this directory does not have an index file and one is needed. So we adjust the variables to reflect that; the output filename is index.html and the template is branch, or root in case the directory we are handling is the source root directory.
  8. We set the full output filename and we assume that we’re going to create it. But if it exists and the corresponding source file is older, we will keep it as it is. And if the source file is newer, then we’ll have to update the output file. This will be handy when we’re making the navigation bar template later.
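
The naming rules in steps 4 to 6 can be tried out without touching the filesystem. The sketch below is not the mapper itself: demo_classify is a hypothetical helper that takes a bare file name, uses parameter expansion instead of basename to strip the extension, and is told explicitly (via a made-up 'top' flag) whether the file sits in the source root.

```shell
DEMO_EXT='txt'	# assumed content extension, same as the default above

# Hypothetical helper: print "<output filename>:<parameter>" for a source file name.
# $1 = file name without its directory, $2 = 'top' if the file is in the source root.
demo_classify() {
	# Strip the content extension, if the name ends with it.
	DEMO_NAME=${1%".${DEMO_EXT}"}
	if [ "${DEMO_NAME}" = "${1}" ]; then
		# Nothing was stripped: not a content file, so it is copied as is.
		printf '%s:static\n' "${1}"
	elif [ "${DEMO_NAME}" = 'index' ]; then
		if [ "${2:-}" = 'top' ]; then
			printf 'index.html:root\n'
		else
			printf 'index.html:branch\n'
		fi
	else
		printf '%s.html:leaf\n' "${DEMO_NAME}"
	fi
}

demo_classify page1.txt
demo_classify index.txt top
demo_classify index.txt
demo_classify page2
demo_classify index.html
```

Note that index.html comes out as static, which is what gives a hand-written index.html priority over a generated one.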

After all that, we write the variables as a table in a colon-separated values format to the map file. Then we start another, smaller loop, but this time we loop over the output directory:

	for PUBLIC_FILE in $( find "${PUBLIC_DIR%/}/" ); do
		[ ${PUBLIC_FILE} -ef ${PUBLIC_DIR} ] && continue
		if ( ! grep -E -q "^${PUBLIC_FILE}(:|/)" ${FILES_MAP} ); then
			PUBLIC_SUBDIR=$( dirname ${PUBLIC_FILE} )
			until [ ${PUBLIC_SUBDIR} -ef ${PUBLIC_DIR} ]; do
				grep -q "^${PUBLIC_SUBDIR}:delete:directory" ${FILES_MAP} \
					&& continue 2
				PUBLIC_SUBDIR=$( dirname ${PUBLIC_SUBDIR} )
			done
			PARAMETER='file'
			[ -d ${PUBLIC_FILE} ] && PARAMETER='directory'
			printf '%s:%s:%s\n' \
				${PUBLIC_FILE} \
				'delete' \
				${PARAMETER} \
				>> ${FILES_MAP}
		fi
	done

	sed -i '/^$/d' ${FILES_MAP}
}

This loop simply marks files and directories that do not have a corresponding source file for deletion. We do that by checking if the file’s path exists in the first column of the map; if it is not there, we mark it for deletion. We also check if any of the current file’s parent directories, up until the root of the public directory, is destined to be deleted. If so, we skip this file/directory since it’ll be deleted when we delete that parent directory.

Finally we use sed to remove the empty lines from the file.
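
The parent-directory walk from the deletion loop is easier to follow in isolation. This is a simplified sketch, not the code above: demo_parent_marked is a hypothetical helper that compares paths as plain strings (instead of -ef) and looks them up in a newline-separated list (instead of the map file).

```shell
# Hypothetical helper: succeed if any ancestor of $1, below the root $2,
# appears as a whole line in the newline-separated list $3.
demo_parent_marked() {
	DEMO_PARENT=$( dirname "${1}" )
	while [ "${DEMO_PARENT}" != "${2}" ] \
		&& [ "${DEMO_PARENT}" != '/' ] \
		&& [ "${DEMO_PARENT}" != '.' ]; do
		# -F: fixed string, -x: whole line, so paths are never treated as patterns.
		printf '%s\n' "${3}" | grep -q -x -F -e "${DEMO_PARENT}" && return 0
		DEMO_PARENT=$( dirname "${DEMO_PARENT}" )
	done
	return 1
}

DEMO_MARKED='public/ddd'

if demo_parent_marked 'public/ddd/fff/ggg' 'public' "${DEMO_MARKED}"; then
	printf 'skip public/ddd/fff/ggg, a parent is already marked\n'
fi
if ! demo_parent_marked 'public/dir1/dir12' 'public' "${DEMO_MARKED}"; then
	printf 'public/dir1/dir12 needs its own entry\n'
fi
```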

Executing the function with the following directory structure:

dweb/
|-- dweb.sh
|-- public
|   |-- aaa
|   |-- bbb
|   |-- ccc
|   |-- d
|   |-- ddd
|   |   |-- eee
|   |   `-- fff
|   |       `-- ggg
|   |-- dir1
|   |   `-- dir12
|   |       `-- index.html
|   `-- index.html
`-- source
    |-- dir1
    |   |-- page1.txt
    |   |-- page2
    |   `-- page3
    |-- dir2
    |   |-- index.txt
    |   |-- page1
    |   |-- page2.txt
    |   `-- page3
    |-- dir3
    |   |-- index.html
    |   |-- page1
    |   |-- page2
    |   `-- page3.txt
    |-- index.txt
    |-- page1.txt
    |-- page2
    `-- page3

Generates the following map file:

dweb/public/aaa:delete:file
dweb/public/bbb:delete:file
dweb/public/ccc:delete:directory
dweb/public/d:delete:directory
dweb/public/ddd:delete:directory
dweb/public/dir1/dir12:delete:directory
dweb/public/dir1/index.html:create:branch:dweb/source/dir1
dweb/public/dir1/page1.html:create:leaf:dweb/source/dir1/page1.txt
dweb/public/dir1/page2:create:static:dweb/source/dir1/page2
dweb/public/dir1/page3:create:static:dweb/source/dir1/page3
dweb/public/dir2/index.html:create:branch:dweb/source/dir2/index.txt
dweb/public/dir2/page1:create:static:dweb/source/dir2/page1
dweb/public/dir2/page2.html:create:leaf:dweb/source/dir2/page2.txt
dweb/public/dir2/page3:create:static:dweb/source/dir2/page3
dweb/public/dir3/index.html:create:static:dweb/source/dir3/index.html
dweb/public/dir3/page1:create:static:dweb/source/dir3/page1
dweb/public/dir3/page2:create:static:dweb/source/dir3/page2
dweb/public/dir3/page3.html:create:leaf:dweb/source/dir3/page3.txt
dweb/public/index.html:keep:root:dweb/source/index.txt
dweb/public/page1.html:create:leaf:dweb/source/page1.txt
dweb/public/page2:create:static:dweb/source/page2
dweb/public/page3:create:static:dweb/source/page3
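
The code that consumes this map has not been written yet, so take the following only as a sketch of how a later stage might read it: a while read loop with IFS set to a colon, fed a few rows copied from the map above. The DEMO_* names and the counters are made up for the demo.

```shell
DEMO_MAP=$( mktemp )
cat > "${DEMO_MAP}" <<'EOF'
dweb/public/aaa:delete:file
dweb/public/dir1/page1.html:create:leaf:dweb/source/dir1/page1.txt
dweb/public/dir1/page2:create:static:dweb/source/dir1/page2
dweb/public/index.html:keep:root:dweb/source/index.txt
EOF

DEMO_BUILD=0
DEMO_DELETE=0
DEMO_KEEP=0

# IFS=: applies to read only and splits each row into its columns.
# Delete rows have three columns, so DEMO_SRC is simply empty for them.
while IFS=: read -r DEMO_OUT DEMO_OP DEMO_PARAM DEMO_SRC; do
	case ${DEMO_OP} in
		create|update) DEMO_BUILD=$(( DEMO_BUILD + 1 )) ;;
		delete)        DEMO_DELETE=$(( DEMO_DELETE + 1 )) ;;
		keep)          DEMO_KEEP=$(( DEMO_KEEP + 1 )) ;;
	esac
done < "${DEMO_MAP}"
rm "${DEMO_MAP}"

printf 'build: %s, delete: %s, keep: %s\n' \
	"${DEMO_BUILD}" "${DEMO_DELETE}" "${DEMO_KEEP}"
```

Because the loop reads from a redirection rather than a pipe, it runs in the current shell and the counters are still set after done.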

I didn’t have the time to test all the corner cases of this code and it is nowhere near finished. But that’s it for now, till next time 👋.