How to improve WordPress user registration validation

OVERVIEW

When someone registers for a user account on your WordPress site, WordPress passes the submitted registration form data to the register_new_user( ) function. This function validates the requested user name and the supplied email address, but overall it does very little validation of the submitted form. This article describes several improvements you can make to this process to reduce the number of junk registrations on your WordPress site.

BACKGROUND

By itself, the register_new_user( ) function in WordPress performs the following forms of validation on new user registrations…

  • Check that the username is not empty.
  • Remove character accents, pre-encoded entities, and HTML tags, and collapse consecutive whitespace into a single space, using sanitize_user( ).
  • Check that the user name does not already exist, using username_exists( ).
  • Check that the email address is not empty.
  • Check that the email address is valid, using is_email( ). WordPress “rolled its own” email validation function here. The developers admit in the documentation and the code comments that this function does not properly validate international domains, and does not correctly test for invalid characters. It also does not distinguish between ASCII and UTF-8 encoding of form data. This entire function could have been replaced with a call to:
    filter_var( $user_email, FILTER_VALIDATE_EMAIL, FILTER_FLAG_EMAIL_UNICODE )

    Older versions of PHP’s FILTER_VALIDATE_EMAIL filter were not entirely RFC compliant. So this WordPress function may be an old attempt at doing a better job than PHP, and it is no longer necessary.

  • Check if a user already exists with the same email address.

We can add code, either to the WordPress theme’s code or as a plugin, to improve the sanitization and validation performed by register_new_user( ).
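
If you go the plugin route, a minimal sketch of the plugin file might look like the following. The plugin name and description are placeholders invented for this example. WordPress only needs the "Plugin Name" header line to list the plugin on the Plugins screen.

```php
<?php
/**
 * Plugin Name: Registration Validation Tweaks
 * Description: Extra validation for new user registrations. (Placeholder name and text.)
 */

// Refuse to run when the file is requested directly instead of through WordPress.
if ( ! defined( 'ABSPATH' ) ) {
	exit;
}

// Paste the add_filter() and add_action() snippets from this article below this line.
```

Save the file in its own folder under wp-content/plugins, then activate it from the admin panel. This keeps the validation code in place even if you switch themes.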


VALIDATION IMPROVEMENT IDEAS

MX Resource Record lookup

First up, we can easily check whether the domain in the user's email address is capable of receiving email. We do this by using the checkdnsrr( ) PHP function to see if an MX record exists for the domain. A domain without an MX record is very unlikely to accept email. (Strictly speaking, a sending server may fall back to the domain's address record, but real mail domains almost always publish an MX record.) The following code will cut down on junk registrations that use email addresses with random text in the domain name. Unfortunately, it cannot stop random account names created at common free email services like Gmail or Outlook.

/**
 * Test if there is an MX record for registered email address.
 *
 * @param WP_Error $errors
 * @param string $sanitized_user_login
 * @param string $user_email
 */
add_filter( 'registration_errors', function ( $errors, $sanitized_user_login, $user_email ) {

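	/*
	 * Pop off the domain from the supplied email address.
	 * array_pop() takes its argument by reference, so store the explode() result first.
	 */
	$email_parts = explode( '@', $user_email );
	$domain      = array_pop( $email_parts );

	/* Check email domain name for MX records. The trailing dot makes the name fully qualified. */
	if ( false === checkdnsrr( $domain . '.', 'MX' ) ) {
		$errors->add( 'mx_error',
			sprintf( __( 'ERROR: Domain for %s does not accept email.' ),
				esc_html( $user_email ) )
		);
	}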

	return $errors;

}, 10, 3 );
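
If rejecting every domain without an MX record turns out to be too strict for your audience, one possible variation also accepts a domain that only has an A or AAAA record, since sending servers may fall back to those. The sketch below is an assumption-laden example, not a drop-in replacement: the function name and the injectable $lookup parameter are made up here so the logic can be tried without real DNS queries.

```php
<?php
/**
 * Sketch: decide whether an email domain looks able to receive mail.
 *
 * @param string        $email  Email address to inspect.
 * @param callable|null $lookup Optional stand-in with the same signature as
 *                              checkdnsrr( $hostname, $type ). Hypothetical
 *                              parameter, added only to make testing possible.
 * @return bool True if an MX record, or failing that an A or AAAA record, exists.
 */
function example_domain_accepts_mail( string $email, ?callable $lookup = null ): bool {
	$lookup = $lookup ?? 'checkdnsrr';

	$at = strrpos( $email, '@' );
	if ( false === $at || $at === strlen( $email ) - 1 ) {
		return false; // No "@" at all, or nothing after it.
	}

	// The trailing dot makes the name fully qualified, as in the filter above.
	$fqdn = substr( $email, $at + 1 ) . '.';

	foreach ( [ 'MX', 'A', 'AAAA' ] as $type ) {
		if ( $lookup( $fqdn, $type ) ) {
			return true;
		}
	}

	return false;
}
```

Inside a 'registration_errors' callback you would call example_domain_accepts_mail( $user_email ) and add an error when it returns false. Keep in mind that each lookup is a network round trip, so the fallback adds a little latency for domains that have no MX record.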

 

Fix common email address typos

Ordinarily, you should just reject and report any invalid form field. But some email address typos are so common that a little help might be appreciated. The user_registration_email hook gives us access to the user’s email address before validation begins. We can fix some common typos, then send it back to begin WordPress’ email validation process.

/**
 * Clean up common email typos.
 * Use PHP's email sanitation function too.
 *
 * @param string $user_email
 */
add_filter( 'user_registration_email', function ( $user_email ) {
	$search	 = [ ' ', ',', '..', '/', '>', '#', '!' ];
	$replace = [ '', '.', '.', '.', '.', '@', '@' ];

	$user_email = str_replace( $search, $replace, strtolower( trim( $user_email ) ) );
	return filter_var( $user_email, FILTER_SANITIZE_EMAIL );
}, 1 ); // Execute early in the filter list.
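
To see what these replacements actually do, the same steps can be wrapped in a plain function and run outside WordPress. This is only a demonstration sketch. The function name and the sample addresses are made up.

```php
<?php
/**
 * Sketch: the same clean-up steps as the filter above, as a plain function.
 */
function example_fix_email_typos( string $user_email ): string {
	$search  = [ ' ', ',', '..', '/', '>', '#', '!' ];
	$replace = [ '', '.', '.', '.', '.', '@', '@' ];

	$user_email = str_replace( $search, $replace, strtolower( trim( $user_email ) ) );
	return (string) filter_var( $user_email, FILTER_SANITIZE_EMAIL );
}

// Print each mistyped sample next to its cleaned-up form.
foreach ( [ ' John.Doe@Example,com ', 'jane#example..com', 'bob@example>org' ] as $typed ) {
	printf( "%-26s => %s\n", "'" . $typed . "'", example_fix_email_typos( $typed ) );
}
```

The three samples come out as john.doe@example.com, jane@example.com, and bob@example.org. Note that '#' and '!' are technically legal characters in the local part of an email address, so this convenience also rewrites a few rare but valid addresses. Trim the search list if that matters for your users.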

 

Replace is_email( ) validation

The function is_email( ) has an unusual filter hook design. During the email validation process, the function applies the ‘is_email’ filter at several points. Each time the filter is applied, a context is supplied, such as ‘local_invalid_chars’ or ‘sub_invalid_chars’. However, is_email( ) returns the filter result immediately. Suppose your filter received a context of ‘local_invalid_chars’ because the local part of the email address contained valid UTF-8 characters that WordPress did not recognize. You could return the email address to prevent the validation from failing, but the rest of the validation process would never run for that address. To design your filter properly, you need to replace the entire validation process.

Luckily, this is very easy if you use PHP’s FILTER_VALIDATE_EMAIL filter with filter_var( ). The following is what that filter would look like. Just in case other plugins register an ‘is_email’ filter, we set the filter priority to PHP_INT_MAX to try our best to make our filter execute last.

/**
 * Replace WordPress's is_email() validation.
 * This short circuits all errors and successes, then applies PHP's FILTER_VALIDATE_EMAIL.
 *
 * @param string|false $is_email  The email address if successfully passed the is_email() checks, false otherwise.
 * @param string $email  The email address being checked.
 * @param string $context  Context under which the email was tested.
 */
add_filter( 'is_email', function ( $is_email, $email, $context ) {
	return filter_var( $email, FILTER_VALIDATE_EMAIL, FILTER_FLAG_EMAIL_UNICODE );
}, PHP_INT_MAX, 3 ); // Execute late in the filter list, accept three parameters.
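
Before relying on the PHP filter, it is worth trying it against a few addresses on your own server. The following standalone sketch compares the filter with and without FILTER_FLAG_EMAIL_UNICODE, which needs PHP 7.1 or later. The helper name and the sample addresses are invented for this example.

```php
<?php
/**
 * Sketch: report whether an address passes FILTER_VALIDATE_EMAIL.
 *
 * @param string $email         Address to test.
 * @param bool   $allow_unicode Whether to add FILTER_FLAG_EMAIL_UNICODE.
 * @return bool True when filter_var() accepts the address.
 */
function example_email_passes( string $email, bool $allow_unicode ): bool {
	$flags = $allow_unicode ? FILTER_FLAG_EMAIL_UNICODE : 0;
	return false !== filter_var( $email, FILTER_VALIDATE_EMAIL, $flags );
}

// Sample addresses are illustrative only.
foreach ( [ 'user@example.com', 'üser@example.com', 'user@@example.com', 'no-at-sign' ] as $email ) {
	printf( "%s | ASCII only: %s | Unicode flag: %s\n", $email,
		example_email_passes( $email, false ) ? 'pass' : 'fail',
		example_email_passes( $email, true ) ? 'pass' : 'fail' );
}
```

The plain ASCII address should pass both ways, and the two malformed samples should fail both ways. The address with the accented local part is the interesting one, since it should only be accepted when the Unicode flag is set.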

 

Block list of user names

One additional validation check was not mentioned at the start of this article: WordPress checks the requested username against a list of blocked names. In WordPress’ default state, the list of “illegal” user names is empty. However, you can add your own list of usernames to block by using the ‘illegal_user_logins’ filter hook. Even though the default list is empty, plugins might modify it. To avoid overwriting names added elsewhere, merge your list with the supplied array. Note that the comparison of names is case-insensitive.

/**
 * Merge an array of unwanted user names with existing list of blocked names.
 *
 * @param array $usernames  Array of disallowed usernames.
 */
add_filter( 'illegal_user_logins', function ( $usernames ) {
	// Using a static list of unwanted names.
	static $bad_names = [ 'webmaster', 'admin', 'cussword', 'swearword' ];

	/**
	 * Or use the Settings API to allow creating a list from admin panel.
	 */
	// $bad_names = get_option( 'bad_names', [] );

	return array_unique( array_merge( $usernames, $bad_names ) );
} );
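
If you prefer the Settings API route hinted at in the comment above, the list has to be cleaned up before it is stored. The following is a rough sketch of that step, assuming the names are typed into a textarea, one per line. The function name is hypothetical, 'bad_names' matches the option name in the commented-out get_option( ) call, and the code that renders the form field is left out.

```php
<?php
/**
 * Sketch: turn textarea input (one name per line) into a clean array.
 *
 * @param mixed $raw Submitted value, or an array that was already sanitized.
 * @return array Lowercase, trimmed, de-duplicated names.
 */
function example_sanitize_bad_names( $raw ): array {
	if ( is_array( $raw ) ) {
		$raw = implode( "\n", $raw ); // Already an array; normalise it again.
	}

	$names = preg_split( '/[\r\n,]+/', strtolower( (string) $raw ) );
	$names = array_filter( array_map( 'trim', $names ), 'strlen' );

	return array_values( array_unique( $names ) );
}

// Only register the option when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
	add_action( 'admin_init', function () {
		register_setting( 'general', 'bad_names', [
			'type'              => 'array',
			'sanitize_callback' => 'example_sanitize_bad_names',
			'default'           => [],
		] );
	} );
}
```

Once the option is populated, the get_option( 'bad_names', [] ) line in the filter above can take the place of the static list.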

 

Block list of email addresses

There isn’t any code in register_new_user( ) that blocks disallowed email addresses. The following code shows how you could block registrations that request specific email addresses. You could also include code that blocks any email address that uses your web site’s domain name. Just remember to allow through any specific email addresses on your domain that you do want to use on your site. Be aware that blocking user names or email addresses will also prevent changes to existing accounts that use a blocked user name or email address. Think carefully about what you are blocking.

/**
 * Block unwanted email addresses.
 *
 * @param WP_Error $errors
 * @param string $sanitized_user_login
 * @param string $user_email
 */
add_filter( 'registration_errors', function ( $errors, $sanitized_user_login, $user_email ) {
	// Using a static list of unwanted email addresses.
	static $forbidden_emails = [ '[email protected]', '[email protected]' ];

	/**
	 * Or use the Settings API to allow creating a list from admin panel.
	 */
	// $forbidden_emails = get_option( 'forbidden_emails', [] );

	if ( in_array( strtolower( $user_email ), array_map( 'strtolower', $forbidden_emails ), true ) ) {
		$errors->add( 'forbidden_email', __( 'Error: Sorry, that email address is not allowed.' ) );
	}

	return $errors;
}, 10, 3 );
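
The idea of blocking addresses on your own domain, apart from an allow list, could be sketched as follows. All names here are hypothetical, and the allow list is a placeholder you would replace with the addresses you really use. The comparison logic lives in a plain function so it can be tried outside WordPress.

```php
<?php
/**
 * Sketch: is this address on the site's own domain without being explicitly allowed?
 *
 * @param string $email     Address submitted on the registration form.
 * @param string $site_host Host name of the site, for example from home_url().
 * @param array  $allowed   Addresses on the site's domain that may register.
 * @return bool True when the address should be blocked.
 */
function example_is_unlisted_site_email( string $email, string $site_host, array $allowed ): bool {
	$email     = strtolower( trim( $email ) );
	$site_host = strtolower( $site_host );
	$at        = strrpos( $email, '@' );

	if ( false === $at || '' === $site_host ) {
		return false; // Nothing to compare; leave it to the normal validation.
	}

	$domain         = substr( $email, $at + 1 );
	$on_site_domain = $domain === $site_host || str_ends_with( $domain, '.' . $site_host );

	return $on_site_domain && ! in_array( $email, array_map( 'strtolower', $allowed ), true );
}

// Hook it up only when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'registration_errors', function ( $errors, $sanitized_user_login, $user_email ) {
		$site_host = preg_replace( '/^www\./', '', (string) wp_parse_url( home_url(), PHP_URL_HOST ) );
		$allowed   = [ 'editor@' . $site_host ]; // Placeholder allow list.

		if ( example_is_unlisted_site_email( $user_email, $site_host, $allowed ) ) {
			$errors->add( 'forbidden_email', __( 'Error: Sorry, that email address is not allowed.' ) );
		}

		return $errors;
	}, 10, 3 );
}
```

The sketch also treats subdomains of the site's host as "your domain", which may or may not suit your setup.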

 

Block list of email address domains

There are numerous free temporary email services that forum spammers use to register on WordPress sites. You could block entire domains from being used to register on your web site. Just be aware that this is a game of whack-a-mole. The nanosecond you build a domain block list, you can be sure several new temporary email services will pop up somewhere else.

You could create a scheduled event to download a regularly updated open-source disposable email domain blocklist every day, then parse it into a Settings API option for use in the example function shown below.

/**
 * Block unwanted email address domains.
 *
 * @param WP_Error $errors
 * @param string $sanitized_user_login
 * @param string $user_email
 */
add_filter( 'registration_errors', function ( $errors, $sanitized_user_login, $user_email ) {
	// Using a static list of unwanted partial domains.
	static $forbidden_domains = [ '0815.', '10-minute-mail.', '10minutemail.', '.anonaddy.',
		'borged.', 'bump.email', 'bumpmail.', 'burnermail.', 'buttondown.email', 'backlav.',
		'ckptr.com', 'chewydonut.', 'cloudmailin.',
		'deadfake.', 'deadspam.', '.debugmail.io', 'despam.', 'despammed.', 'dev-null.', 'developermail.', 'discard.',
		'email-fake.', 'emailage.', 'emaildrop.io', 'emailz.', 'explodemail.',
		'forgetmail.', 'forspam.', 'hot-mail.', 'hottempmail.', 'hypenated-domain.',
		'inboxkitten.com', 'ilovespam.', 'itsjiff.com',
		'junkie.', 'junkmail.', 'killmail.', 'lilspam.com', 'lyft.live',
		'mail7.io', 'mailbiscuit.com', 'mailhazard.', 'mailinator.', 'mailmenot.', 'mailtrash.',
		'nevermails.', 'nobugmail.', 'nomail.', 'nonspam.', 'nonspammer.',
		'oneoffemail.', 'oneoffmail.', 'one-time.email',
		'pizzajunk.', 'pleasenoham.', 'putthisinyourspamdatabase.',
		'realquickemail.', 'rejectmail.', 'selfdestructingmail.', 'sharkfaces.',
		'silenceofthespam.', 'spambob.', 'spambog.', 'spamfellas.', 'spamfighter.', 'spamsandwich.',
		'temp-mail.', 'tempmail.', 'thespamfather.', 'throwawayemailaddress.', 'throwawaymail.', 'trashmail.',
		'whaaaaaaaaaat.', 'wegwerf-email.', 'wegwerfemail.', 'wpdork.',
		'yourspamgoesto.', 'yxzx.', 'zoemail.'
	];

	/**
	 * Or use the Settings API to allow creating a list from admin panel.
	 * Or, you could create a WP_CRON event to automatically parse an
	 * open-source block list into this option.
	 */
	// $forbidden_domains = get_option( 'forbidden_domains', [] );

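	/* array_pop() takes its argument by reference, so store the explode() result first. */
	$email_parts = explode( '@', $user_email );
	$user_domain = strtolower( array_pop( $email_parts ) );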

	foreach ( $forbidden_domains as $domain_part ) {
		if ( str_contains( $user_domain, $domain_part ) ) {
			$errors->add( 'forbidden_email', __( 'Error: Sorry, that email service is not allowed.' ) );
			return $errors;
		}
	}

	return $errors;
}, 10, 3 );
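
The scheduled download mentioned earlier could be sketched roughly as follows. Treat every name here as an assumption: the function name, the hook name, and the URL are placeholders, and the parser assumes a plain-text list with one domain per line and optional "#" comments. Adjust it to the format of the list you actually choose.

```php
<?php
/**
 * Sketch: parse a plain-text block list (one domain per line) into an array.
 *
 * @param string $body Raw text of the downloaded list.
 * @return array Lowercase, de-duplicated domain names.
 */
function example_parse_domain_list( string $body ): array {
	$domains = [];
	foreach ( preg_split( '/\R/', $body ) as $line ) {
		$line = strtolower( trim( $line ) );
		// Skip blank lines and "#" comments.
		if ( '' === $line || '#' === $line[0] ) {
			continue;
		}
		$domains[] = $line;
	}
	return array_values( array_unique( $domains ) );
}

// Scheduling and downloading only make sense inside WordPress.
if ( function_exists( 'add_action' ) ) {
	// Queue a daily event once. The hook name is made up for this example.
	add_action( 'init', function () {
		if ( ! wp_next_scheduled( 'example_refresh_forbidden_domains' ) ) {
			wp_schedule_event( time(), 'daily', 'example_refresh_forbidden_domains' );
		}
	} );

	add_action( 'example_refresh_forbidden_domains', function () {
		// Placeholder URL: substitute the raw-text address of the list you trust.
		$response = wp_remote_get( 'https://example.com/disposable-domains.txt' );
		if ( is_wp_error( $response ) || 200 !== wp_remote_retrieve_response_code( $response ) ) {
			return; // Keep the previous list when the download fails.
		}

		$domains = example_parse_domain_list( wp_remote_retrieve_body( $response ) );
		if ( $domains ) {
			update_option( 'forbidden_domains', $domains, false );
		}
	} );
}
```

One caution: the filter above matches partial strings, while a downloaded list usually contains complete domain names. A substring match against complete names can block too much (an entry such as "mail.com" would also match "gmail.com"), so with a list like this an exact comparison, for example with in_array( ), is probably the safer choice.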

 

You could avoid the headache of maintaining a domain block list by using an email validator API, such as EmailValidation.io or MailCheck.ai. These services even provide a “Did you mean?” suggestion to make your interface more user friendly. If you do not need a lot of email validations per day, then these services are free. Some provide free WordPress plugins so you can avoid any programming at all.

WARNING: Since these third-party services require you to send the email address to be validated to their servers, you could be violating the privacy of your new user’s email address, possibly exposing them to spammers. From this perspective, you may begin to wonder how these services could afford to be free at all unless they are selling the data you send them.

Since these services can’t really test the validity of the local part of the email address, you could replace the local part with random text before submitting it for validation. For example, you could transform [email protected] to [email protected] before sending it to the third-party validation service. Just remember that any “Did you mean?” suggestion will be rendered useless when you do this, but your user’s privacy will be protected. In code, this is as simple as…

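/* Eight random letters replace the local part; only the real domain is sent for validation. */
$obfuscated_email = substr( str_shuffle( 'abcdefghijklmnopqrstuvwxyz' ), 0, 8 ) . '@' .
			substr( strrchr( $user_email, '@' ), 1 );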

 

Add a nonce to the registration form

WordPress uses cryptographic nonces to help protect forms from replay attacks. But WordPress only uses nonces for logged in users, and not when someone tries to register as a new user. How can a nonce help here? We can use WordPress’ nonce generator to protect against malicious bots, at least some of the less sophisticated bots and automation tools. To accomplish this, the code sample shown below does the following.

  • Use the ‘register_form’ action hook to add a hidden input field with no value preset. Also add JavaScript to the form that requests a nonce through WordPress’ AJAX interface. The hidden input field value is set with the returned nonce. The JavaScript also performs some simple tests to try to block common bots and automation tools. Bots that do not have JavaScript engines will fail to correctly fill in this hidden input, and the user registration will fail.
  • Add an AJAX handler function to generate and return nonces to the registration form. The same function checks to see if the WordPress test cookie exists. A nonce won’t be given to bots that don’t properly handle cookies, and the user registration will fail.
  • Add a ‘registration_errors’ filter to validate the received nonce. Bots that fail to return a valid nonce will fail the user registration process.

Detecting non-human systems accessing web sites is an ongoing arms race. The following code does not guarantee that bad actors won’t be able to use automated systems to create bogus accounts on your WordPress site. But it will stop some of the less sophisticated threat actors.

/**
 * Add nonce and script to registration form.
 */
add_action( 'register_form', function () {
	$ajaxurl = admin_url( 'admin-ajax.php', (is_ssl()) ? 'https' : 'http' );
	?>
	<input name="regnum" type="hidden" value="" />
	<script>
		document.addEventListener( 'DOMContentLoaded', ( event ) => {
			if ( ( navigator && navigator.onLine && navigator.cookieEnabled && navigator.language &&
				( navigator.userAgent || navigator.userAgentData ) ) &&
				false === ( '__selenium_unwrapped' in window ) && false === ( '_phantom' in window ) &&
				false === ( 'webdriver' in window ) && false === ( '__nightmare' in window ) ) {
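				// Assumes jQuery is loaded on the registration page; enqueue it if your login page does not load it.
				jQuery.post( '<?php echo esc_url( $ajaxurl ); ?>', 'action=getregnum', function ( data ) {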
					try {
						if ( data.regnum ) {
							jQuery( 'form#registerform input[name="regnum"]' ).val( data.regnum );
						}
					} catch ( ex ) {
						window.location.href = 'http://localhost';
					}
				} ).fail( function () {
					window.location.href = 'http://localhost';
				} );
			}
		} );
	</script>
	<?php
} );

/**
 * Handle AJAX requests for user registration form nonce
 */
function ajax_getregnum() {
	/* Make sure WordPress test cookie was received before responding. */
	if ( defined( 'TEST_COOKIE' ) && !empty( $_COOKIE[ TEST_COOKIE ] ) ) {
		wp_send_json( [ "regnum" => wp_create_nonce( 'regnonce' ) ] );
	}
	exit();
}

add_action( 'wp_ajax_getregnum', 'ajax_getregnum' );
add_action( 'wp_ajax_nopriv_getregnum', 'ajax_getregnum' );

/**
 * Verify registration form nonce.
 *
 * @param WP_Error $errors WP_Error object containing errors encountered during registration.
 * @param string $sanitized_user_login User's username after it has been sanitized.
 * @param string $user_email User's email.
 */
add_filter( 'registration_errors', function ( $errors, $sanitized_user_login, $user_email ) {
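	/*
	 * Make sure regnum is valid and was generated within the last 12 hours.
	 * wp_verify_nonce() returns 1 for a nonce created 0-12 hours ago, 2 for
	 * 12-24 hours ago, and false if it is invalid. check_ajax_referer() is not
	 * used here because its second argument is the name of the request field,
	 * not the nonce value itself.
	 */
	if ( empty( $_POST[ 'regnum' ] ) ||
		1 !== wp_verify_nonce( sanitize_text_field( wp_unslash( $_POST[ 'regnum' ] ) ), 'regnonce' ) ) {
		/* Don't waste any more time on these hackers. */
		wp_die();
	}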
	/* Return unmodified error object on success. */
	return $errors;
}, 10, 3 );