Dec. 27, 2011

Is text mostly uppercase? PHP function

If you need to estimate whether a string consists mostly of uppercase or lowercase letters, this might work for you:

function is_mostly_uppercase( $s ) {
	// Byte histogram: $r[n] is how many times byte value n occurs in $s.
	$r = count_chars( $s );
	$upper = array_slice( $r, 65, 26 );	// A-Z (65-90)
	$lower = array_slice( $r, 97, 26 );	// a-z (97-122)
	$upper_sum = array_sum( $upper );
	$lower_sum = array_sum( $lower );

	// A tie counts as "not mostly uppercase".
	return $upper_sum > $lower_sum;
}

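One of the lesser-used PHP functions is count_chars(): it returns a histogram of the byte values (0-255) found in a string. It has its limits, since it counts bytes rather than characters and the function above only looks at ASCII A-Z and a-z, but it might work for somewhat predictable strings.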
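To make the histogram idea concrete, here is a minimal sketch of what count_chars() returns for a short string. It uses mode 1 (only the bytes that occur) instead of the default mode used in the function above, purely to keep the output short; the sample strings and the variable names $counts and $accented are illustrative and not part of the original function.

```php
<?php
// Mode 1 returns only the byte values that occur, keyed by byte value.
$counts = count_chars( 'Hello', 1 );

foreach ( $counts as $byte => $n ) {
	printf( "%3d (%s): %d\n", $byte, chr( $byte ), $n );
}
// Prints:
//  72 (H): 1
// 101 (e): 1
// 108 (l): 2
// 111 (o): 1

// One limit: count_chars() counts bytes, not characters.
// "É" encoded as UTF-8 is the two bytes 0xC3 0x89,
// so it lands in neither the A-Z nor the a-z slice.
$accented = count_chars( "\xC3\x89", 1 );
// $accented is [ 137 => 1, 195 => 1 ]
```

In the default mode the array always has 256 entries (keys 0 to 255), which is what makes the fixed-offset array_slice() calls possible.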

This is a graphical representation of $r after count_chars( $string ). Each box represents one array element, i.e. one byte value from 0 to 255. Red boxes are uppercase characters (range 65-90) and green boxes are lowercase characters (97-122). Using array_sum() on the red and green slices gives you two counts to compare.

[Figure: 256 boxes, one per array index 0 to 255, showing the count_chars() output for 'This IS A senTEnCe'. Non-zero entries: 32 (space) = 3; 65 (A) = 1; 67 (C) = 1; 69 (E) = 1; 73 (I) = 1; 83 (S) = 1; 84 (T) = 2; 101 (e) = 2; 104 (h) = 1; 105 (i) = 1; 110 (n) = 2; 115 (s) = 2. All other entries are 0.]

The graphic above shows the internals of this call:

var_dump( is_mostly_uppercase( 'This IS A senTEnCe' ) );	// bool(false)
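To check that result by hand, the two sums can be printed directly. This is a minimal sketch that repeats the slice-and-sum step outside the function, using only the calls already shown in this post:

```php
<?php
// Default mode: 256 counts, one per byte value 0-255.
$r = count_chars( 'This IS A senTEnCe' );

$upper_sum = array_sum( array_slice( $r, 65, 26 ) );	// A-Z
$lower_sum = array_sum( array_slice( $r, 97, 26 ) );	// a-z

var_dump( $upper_sum );	// int(7): T, I, S, A, T, E, C
var_dump( $lower_sum );	// int(8): h, i, s, s, e, n, n, e
```

Seven uppercase letters against eight lowercase ones, so the function returns false. The three spaces (byte 32) fall outside both slices and are ignored.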

© 2009-2017 Edin (Dino) Beslagic