Writing a ~100Kb HTML string over an MD file (number of iterations ~10K)














$begingroup$


I am writing a large string (~100-120 KB of HTML) to an md file, and I am fairly sure my method is not the fastest, even though it only has to iterate ~8,000-10,000 times, a few times per hour.



There is also a low (~1%-2%) probability that the target file still has an old name (previousName) that does not exactly match the new name (newName), because the data flows through an API.



Key Script: Inside For Loop



$cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // slug company
$hay=strtolower($arr["quote"]["primaryExchange"]); // exchange market

if(strpos($hay, 'nasdaq')===0){
    $mk='nasdaq-us';
    $nasdaq++;
}elseif(strpos($hay, 'nyse')===0 || strpos($hay, 'new york')===0){
    $mk='nyse-us';
    $nyse++;
}elseif(strpos($hay, 'cboe')===0){
    $mk='cboe-us';
    $cboe++;
}else{
    $mk='market-us';
    $others++;
}

$sc=str_replace(array(' '), array('-'), strtolower($s["quote"]["sector"])); // slug sector
$enc=UpdateStocks::getEnc($symb,$symb,$symb,self::START_POINT_URL_ENCRYPTION_APPEND, self::LENGTH_URL_ENCRYPTION_APPEND); // simple 4 length encryption output: e.g., 159a

$dir=__DIR__ . self::DIR_FRONT_SYMBOLS_MD_FILES; // symbols front directory

if(!is_dir($dir)){mkdir($dir, 0755,true);} // creates price targets directory if not exist

// symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

// duplication risk
if(strtolower($symb)===strtolower($cn)){
    // duplicated name from one symbol
    $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
    $lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
}else{
    // duplicated name from one symbol
    $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
    $lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
}

// new md filename
$newName=$dir . self::SLASH . $lurl . self::EXTENSION_MD;

// Replace multiple dashes with single dash: "aa-alcoa-basic-materials-nyse-us-159a"
$newName = preg_replace('/-{2,}/', '-', $newName);

// if file not exist: generate file
if($previousNames==null){
    $fh=fopen($newName, 'wb');
    fwrite($fh, '');
    fclose($fh);
}else{
    // if file not exist:
    foreach($previousNames as $k=>$previousName){
        if($k==0){
            // safety: if previous filename not exactly equal to new filename
            rename($previousName, $newName);
        }else{
            // in case multiple files found: unlink
            unlink($previousName);
        }
    }
}

// This method is not review required now.
$mdFileContent=UpdateStocks::getBaseHTML($s,$l,$z); // gets HTML

if(file_exists($newName)){
    if(is_writable($newName)){
        file_put_contents($newName,$mdFileContent);
        echo $symb. " 💚 " . self::NEW_LINE;
    }else{
        echo $symb . " symbol file in front directory is not writable in " . __METHOD__ . " 💔" . self::NEW_LINE;
    }
}else{
    echo $symb . " file not found in " . __METHOD__ . " 💛" . self::NEW_LINE;
}
}


searchFilenames



/**
 *
 * @return an array with values of paths of all front md files stored
 */
public static function searchFilenames($array,$re){
    $arr= array();
    foreach($array as $k=>$str){
        $pos=strpos($str, $re);
        if($pos!==false){
            array_push($arr, $str);
        }
    }
    return $arr;
}


var_dump($hay)



string(20) "nasdaq global"
string(23) "new york stock exchange"
string(20) "nasdaq global market"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(23) "nyse arca"
string(23) "new york stock exchange"
string(23) "nyse"
string(20) "nasdaq"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(23) "cboe"
string(23) "new york stock exchange"
string(20) "nasdaq global select"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(23) "new york stock exchange"
...


>5% Probability of var_dump($lurl)



For strtolower($symb)===strtolower($cn)



string(27) "aac-healthcare-nyse-us-e92a"
string(35) "aaon-basic-materials-nasdaq-us-238e"
string(28) "abb-industrials-nyse-us-a407"
string(38) "acnb-financial-services-nasdaq-us-19fa"


<95% Probability of var_dump($lurl)



For not strtolower($symb)===strtolower($cn)



string(50) "aadr-advisorshares-dorsey-wright-adr--nyse-us-d842"
string(39) "aal-airlines-industrials-nasdaq-us-29eb"
string(68) "aamc-altisource-asset-management-com-financial-services-nyse-us-b46a"
string(47) "aame-atlantic-financial-services-nasdaq-us-8944"
string(35) "aan-aarons-industrials-nyse-us-d00e"
string(54) "aaoi-applied-optoelectronics-technology-nasdaq-us-1dee"
string(56) "aap-advance-auto-parts-wi-consumer-cyclical-nyse-us-1f60"
string(36) "aapl-apple-technology-nasdaq-us-8f4c"
string(35) "aat-assets-real-estate-nyse-us-3598"
string(49) "aau-almaden-minerals-basic-materials-nyse-us-1c57"
string(51) "aaww-atlas-air-worldwide-industrials-nasdaq-us-69f3"
string(59) "aaxj-ishares-msci-all-country-asia-ex-japan--nasdaq-us-c6c4"
string(47) "aaxn-axon-enterprise-industrials-nasdaq-us-0eef"
string(58) "ab-alliancebernstein-units-financial-services-nyse-us-deb1"



$symb:



An uppercase string that stands for the "symbol" of an equity, sometimes containing dashes.



"AADR"
"AAL"
"AAMC"
"AAME"
"AAN"
"AAOI"
"AAP"
"AAPL"
"AAT"
"AAU"
"AAWW"
"AAXJ"
"AAXN"
"AB"
"GS-A"
"GS-B"
"GS-C"


Would you be so kind as to help me make this script faster/simpler?





















$endgroup$








  • 1




    $begingroup$
    You could do substr($hay, 0, 2) once and then check the first two letters instead of calling strpos multiple times. Maybe, but I have no idea what $hay looks like :) - but then you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 19:27








  • 1




    $begingroup$
    @Emma I have advice to give, but I don't know what $symb is / is doing.
    $endgroup$
    – mickmackusa
    Mar 17 at 23:44








  • 1




    $begingroup$
    Do not change your posted question here. Too late now. You'll suffer some wrath if you do. I've got some good stuff to show you on this one. (I already spent a fair amount of time on this one this morning.) What is $symb? (If you "ping" me on pages where I have not been active, I will not be alerted.)
    $endgroup$
    – mickmackusa
    Mar 18 at 0:01













































performance beginner php strings file-system














edited Mar 18 at 0:15 by Emma

asked Mar 16 at 19:19 by Emma




















2 Answers












$begingroup$

This is what I meant in the comments:



You could do substr($hay, 0, 2) once and then check the first two letters instead of calling strpos multiple times. Maybe, but I have no idea what $hay looks like :) - but then you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.




switch(substr($hay,0,2)){
    case 'na': //nasdaq
        $mk='nasdaq-us';
        $nasdaq++;
        break;
    case 'ny': //nyse
    case 'ne': //new york
        $mk='nyse-us';
        $nyse++;
        break;
    case 'cb': //cboe
        $mk='cboe-us';
        $cboe++;
        break;
    default:
        $mk='market-us';
        $others++;
        break;
}


This way you're making one function call instead of up to four.
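
To try the idea in isolation, here is a minimal sketch that wraps the switch in a function and runs it against a few of the sample values from the question. The function name `exchangeSlug` is made up for illustration, and the counters are left out. One difference from the original strpos checks is worth noting: matching on two letters also accepts any other exchange name that happens to begin with "na", "ny", "ne" or "cb".

```php
<?php
// Hypothetical helper (not part of the original class): maps a lowercased
// exchange name to a market slug by looking only at its first two letters.
function exchangeSlug(string $hay): string
{
    switch (substr($hay, 0, 2)) {
        case 'na': // nasdaq
            return 'nasdaq-us';
        case 'ny': // nyse
        case 'ne': // new york
            return 'nyse-us';
        case 'cb': // cboe
            return 'cboe-us';
        default:
            return 'market-us';
    }
}

// Sample values taken from the var_dump($hay) output in the question.
foreach (['nasdaq global select', 'new york stock exchange', 'nyse arca', 'cboe'] as $hay) {
    echo $hay, ' => ', exchangeSlug($hay), PHP_EOL;
}
```

If false positives on the two-letter prefix are a concern, the original strpos checks are the safer choice; the difference in speed is likely to be negligible next to the file-system work.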



It looks like you're calling strtolower three times on $cn (and three times on $symb):



  $cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // s
//...
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
//...
if(strtolower($symb)===strtolower($cn)){
//------------------
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');

if(strtolower($symb)===strtolower($cn)){


And so forth.



There may be other duplicate calls like this.



Your sprintf calls seem pointless.



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

//you could just do this for example
$p1=self::SLASH.$symb.'-';
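
Putting the last two points together, a rough sketch could look like the following. The sample values and the `'/'` standing in for `self::SLASH` are assumptions made only so the snippet runs on its own; `$symbLower` is a made-up variable name.

```php
<?php
// Assumed sample inputs; in the real loop these come from the API data.
$symb  = 'AAPL';
$cn    = strtolower('Apple'); // stands in for slugCompany(), lowercased once
$slash = '/';                 // assumption: stands in for self::SLASH

$symbLower = strtolower($symb); // lowercase the symbol once and reuse it

$p1 = $slash . $symbLower . '-'; // symbol prefix to search for in the paths
$p2 = $slash . $cn . '-';        // company prefix to search for in the paths

echo $p1, PHP_EOL;
echo $p2, PHP_EOL;
```

Keeping the lowercased symbol in its own variable leaves the original uppercase `$symb` available for the status messages echoed later in the loop.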


This whole chunk is suspect:



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

// duplication risk
if(strtolower($symb)===strtolower($cn)){
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
}else{
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
$lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
}


For example, the only differences are these:



$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
//and
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
$lurl=$symb . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;


So if you could pick the last argument up front and prepend $symb only when it is needed, you could shrink this condition considerably. You see what I mean: it could be more DRY (Don't Repeat Yourself). I don't know enough about the data to really say on this one, but I was thinking of something like this:



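  // assumes $symb and $cn were each lowercased once, earlier
if($symb != $cn){
$p = self::SLASH.$symb.'-';
$lurl = $symb.'-'; // symbol differs from the company slug: prepend it
}else{
$p = self::SLASH.$cn.'-';
$lurl = ''; // symbol equals the company slug: don't repeat it
}

$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p));
$lurl .= "$cn-$sc-$mk-$enc";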


But I am not sure I got everything straight, so make sure to test it. It's kind of hard to work out in my head. You still need a condition, but it's a lot shorter and easier to read.
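One way to make that easy to test is to pull the branch into a small function and feed it sample values. This is only a sketch: `buildPrefixAndSlug` is a hypothetical name, the inputs are assumed to be lowercased already, and `'/'` again stands in for `self::SLASH`:

```php
<?php
/**
 * Hypothetical helper: returns [filename prefix to search for, new URL slug].
 * Assumes $symb and $cn are already lowercased.
 */
function buildPrefixAndSlug(string $symb, string $cn, string $sc, string $mk, string $enc, string $slash = '/'): array
{
    $tail = implode('-', [$cn, $sc, $mk, $enc]);
    if ($symb === $cn) {
        // Symbol equals the company slug: search by company, do not repeat the symbol.
        return [$slash . $cn . '-', $tail];
    }
    // Otherwise search by symbol and prepend it to the slug.
    return [$slash . $symb . '-', $symb . '-' . $tail];
}

// Made-up sample data for the two cases.
[$prefixA, $slugA] = buildPrefixAndSlug('aapl', 'apple-inc', 'technology', 'nasdaq-us', 'x1y2');
[$prefixB, $slugB] = buildPrefixAndSlug('ibm', 'ibm', 'technology', 'nyse-us', 'x1y2');
```

Both branches mirror the original if/else from the question, so the outputs can be compared against what the current code produces.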



For this one:



searchFilenames



/**
*
* @return an array with values of paths of all front md files stored
*/
public static function searchFilenames($array,$re){
$arr= array();
foreach($array as $k=>$str){
$pos=strpos($str, $re);
if($pos!==false){
array_push($arr, $str);
}
}
return $arr;
}


You can use preg_grep. For example:



   public static function searchFilenames($array,$re){
return preg_grep('/'.preg_quote($re,'/').'/i',$array);
}
//or array_filter (strpos needs the haystack first, then the needle)
public static function searchFilenames($array,$re){
return array_filter($array,function($item)use($re){ return strpos($item,$re)!==false;});
}


You're just checking whether $re is contained in each element of $array. preg_grep returns the array entries that match a pattern, and with the i flag it is case insensitive (your original strpos check is case sensitive, so drop the flag if case matters). Both preg_grep and array_filter keep the original array keys. In any case I never use array_push, as $arr[]=$str avoids a function call and is faster. And because the array is passed by value rather than by reference, the function can filter its argument directly without touching the caller's array.



One thing I find useful is to add some example data values to the code in comments. Then you can visualize what transforms you're doing and whether you're repeating yourself.
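To see the difference between the two variants, here is a small sketch with made-up paths. It only uses built-in functions; the file names and the needle are hypothetical:

```php
<?php
// Made-up file list; the real one comes from glob().
$files = ['/front/ibm-ibm.md', '/front/aapl-apple-inc.md', '/front/AAPL-old.md'];
$needle = '/aapl-';

// Case-sensitive substring filter; array_values() reindexes the result from 0.
$bySubstring = array_values(array_filter($files, function ($path) use ($needle) {
    return strpos($path, $needle) !== false;
}));

// Case-insensitive regex filter; preg_grep() preserves the original keys.
$byRegex = preg_grep('/' . preg_quote($needle, '/') . '/i', $files);
```

Here the substring filter finds only the lowercase path, while the regex version also picks up the uppercase one and keeps keys 1 and 2. If later code relies on key 0, wrap the result in array_values().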



One last thing, and this one scares me a bit:



foreach($previousNames as $k=>$previousName){
if($k==0){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Here you're checking that $k (the array key) is 0. It's very easy to change or reset array keys when sorting or filtering, so be careful with that. I would think this is a safer option:



foreach($previousNames as $k=>$previousName){
if($previousName!=$newName){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


I'm not sure if that was a mistake, or maybe I just don't understand that part. It's hard without being able to test what the values are. But it warranted a mention: once the stuff is deleted, it's deleted.
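If the intent is "keep exactly one file under the new name and remove the other matches", another option is to track that explicitly instead of relying on keys or on a name comparison alone. This is a sketch under that assumption; `consolidateFiles` is a hypothetical helper, not something from the question, and it never unlinks a file that already carries the new name:

```php
<?php
/**
 * Hypothetical helper: keep one file under $newName, delete the other matches.
 * Returns the list of deleted paths.
 */
function consolidateFiles(array $previousNames, string $newName): array
{
    // If the target already exists among the matches, keep it and delete the rest.
    $kept = in_array($newName, $previousNames, true);
    $deleted = [];
    foreach ($previousNames as $previousName) {
        if ($previousName === $newName) {
            continue; // never touch the file that already has the right name
        }
        if (!$kept) {
            rename($previousName, $newName); // first match becomes the new file
            $kept = true;
        } else {
            unlink($previousName); // any further match is a stale duplicate
            $deleted[] = $previousName;
        }
    }
    return $deleted;
}

// Usage with throwaway files in the system temp directory.
$dir = sys_get_temp_dir() . '/consolidate_' . uniqid();
mkdir($dir);
$old1 = $dir . '/aapl-old-1.md';
$old2 = $dir . '/aapl-old-2.md';
$new = $dir . '/aapl-new.md';
file_put_contents($old1, 'first');
file_put_contents($old2, 'second');

$deleted = consolidateFiles([$old1, $old2], $new);
```

Returning the deleted paths makes it possible to log what was removed, which helps when something goes missing later.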



Hope it helps you, most of these are minor things, really.

















$endgroup$









  • 2




    $begingroup$
    Well, I usually go over my own code at least three times: once to make it work, once to make it readable and clean it up, and once for performance and security. It's a process, like writing a paper.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:06










  • $begingroup$
    @Emma - is this part of your code? You could rewrite UpdateStocks::searchFilenames, or write another one, so that it avoids the array reverse, for example. Any function calls you can avoid will make it faster and simpler.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:09








  • 1




    $begingroup$
    Yeah, it's mostly small things, which easily creep in when you're trying to get it to work. It's got to work first; then you can look at the logic and flow of things and see where you can reduce the complexity. That usually makes it easier to read, faster, and less error prone.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:29








  • 1




    $begingroup$
    I type A LOT of code, so I try to type as little as possible, lol.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:43



















1












$begingroup$

I am only discovering this question after attending to this associated question from the OP: Writing and Updating ~8K-10K Iterations of URLs Strings on a Text File (PHP, Performance, CRON)




  1. I go to great lengths to avoid the use of "large-battery" if blocks and switch blocks in my code because they are so verbose. You have some predictable/static exchange codes so you can craft a single lookup array at the start of your class and leverage that for all subsequent processes. Having a single lookup array in an easy-to-find location will make your code more manageable for you and other developers. Once you have your lookup array (I'll call it const EXCHANGE_CODES), you can initiate an array to store the running tally for each market code encountered (I'll call it $exchange_counts); this array should be declared one time before the loop is started. Inside the loop, you can use strpos() and substr() to extract the targeted substrings that you posted in your question. Then simply check if the substring exists as a key in EXCHANGE_CODES, declare the found associated value, and increment the respective exchange count.


  2. I see that you are using very few characters in your variable declarations. This requires you to write comments at the end of each line to remind you and other developers what data is held in the variable. This is needlessly inconvenient. Better practice would be to assign meaningful names to your variables.


  3. When preparing the sector slug value, you are passing single-element arrays to str_replace(). This is unnecessary; just pass single strings.


  4. Use a single space after commas when writing function parameters, as well as on either side of all =.


  5. I don't know if $s and $arr are the same incoming array and it is a typo while posting or if they are separate incoming arrays. Either way, the variable names should be more informative. If your script is always accessing the quote subarray, then you might like to declare $quote = $array['quote']; early in your script to allow for the simpler use of $quote. This isn't a big deal, just something to consider.


  6. Changing your current working directory to the new directory will spare you from adding the variable to the glob() parameter AND it will shorten the strings that are being filtered, meaning less work for PHP.


  7. You can put glob()'s excellent filtering feature to good use and avoid calling your static method entirely.
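As a rough illustration of points 6 and 7, glob() can do the prefix filtering by itself when the pattern carries the prefix. This sketch keeps the directory in the pattern instead of calling chdir(); the directory and file names are made up, and it assumes the prefix contains no glob metacharacters such as `*`, `?` or `[`:

```php
<?php
// Throwaway directory with a few made-up files.
$dir = sys_get_temp_dir() . '/glob_demo_' . uniqid();
mkdir($dir);
foreach (['aapl-apple-inc.md', 'aapl-old.md', 'ibm-ibm.md'] as $name) {
    touch($dir . '/' . $name);
}

$slugStart = 'aapl'; // hypothetical prefix to look for

// glob() returns only the paths matching the pattern, sorted alphabetically by default.
$matches = glob($dir . '/' . $slugStart . '-*');
$basenames = array_map('basename', $matches);
```

With chdir() the pattern shrinks to just the prefix and wildcard, at the cost of having to keep track of the current working directory.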



Finally, as I said in your SO question, you should try to consolidate and minimize total file writes if possible.
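A minimal sketch of what consolidating writes can look like: collect the pieces in memory and write the file once. The rows, the markup and the target path are all hypothetical and only stand in for the real data:

```php
<?php
// Hypothetical rows standing in for the items processed in one run.
$rows = [['AAPL', 'Apple Inc'], ['IBM', 'IBM']];

// Build every fragment in memory first.
$parts = [];
foreach ($rows as [$symbol, $name]) {
    $parts[] = '<li>' . htmlspecialchars($symbol) . ': ' . htmlspecialchars($name) . '</li>';
}
$html = "<ul>\n" . implode("\n", $parts) . "\n</ul>\n";

// One write for the whole string; LOCK_EX guards against concurrent writers.
$target = sys_get_temp_dir() . '/single_write_' . uniqid() . '.md';
$bytes = file_put_contents($target, $html, LOCK_EX);
```

Collecting parts in an array and joining them once with implode() also avoids repeated concatenation onto a growing string.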



Here's some untested code to reflect my advice:



// this can be declared with your other class constants (array declaration available from php5.6+):
const EXCHANGE_CODES = [
"nasdaq" => "nasdaq-us",
"nyse" => "nyse-us",
"new york" => "nyse-us",
"cboe" => "cboe-us"
];

// initialize assoc array of counts prior to your loop
$exchange_counts = array_fill_keys(self::EXCHANGE_CODES + ["others"], 0);
/* makes:
* array (
* 'nasdaq-us' => 0,
* 'nyse-us' => 0,
* 'cboe-us' => 0,
* 'others' => 0,
* )
*/

// start actual processing
$company_slug = strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"]));

$exchange_market = strtolower($arr["quote"]["primaryExchange"]);

// lookup market_code using the exchange name truncated at the first space found at or after offset 4
$leading_text = substr($exchange_market, 0, strpos($exchange_market, ' ', 4));
$market_code = self::EXCHANGE_CODES[$leading_text] ?? 'others'; // null coalescing operator (php7+)
++$exchange_counts[$market_code];

$sector_slug = str_replace(' ', '-', strtolower($s["quote"]["sector"]));

$random_string = UpdateStocks::getEnc($symb, $symb, $symb, self::START_POINT_URL_ENCRYPTION_APPEND, self::LENGTH_URL_ENCRYPTION_APPEND);

$dir = __DIR__ . self::DIR_FRONT_SYMBOLS_MD_FILES; // symbols front directory

$equity_symbol = strtolower($equity_symbol);
$slug_start = $company_slug === $equity_symbol ? $company_slug : $equity_symbol;

if (!is_dir($dir)) {
mkdir($dir, 0755, true); // creates price targets directory if not exist (recursively)
} else {
chdir($dir); // change current working directory
$preexisting_files = glob("{$slug_start}-*"); // separate static method call is avoided entirely (not sure why you are reversing)
// if you want to eradicate near duplicate files, okay, but tread carefully -- it's permanent.
}

$new_slug = $slug_start . '-' . $sector_slug . '-' . $market_code . '-' . $random_string;

$new_md_filename = preg_replace('/-{2,}/', '-', $dir . self::SLASH . $new_slug . self::EXTENSION_MD);

if (empty($preexisting_files)) {
// I don't advise the iterated opening,writing an empty file,closing 10,000x
}
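For anyone who wants to try the lookup idea in isolation, here is a self-contained sketch. The constant mirrors EXCHANGE_CODES above, but `marketCode` and the sample exchange names are hypothetical, and unlike the snippet above it falls back to the whole string when no space is found:

```php
<?php
const EXCHANGE_CODES = [
    'nasdaq' => 'nasdaq-us',
    'nyse' => 'nyse-us',
    'new york' => 'nyse-us',
    'cboe' => 'cboe-us',
];

// Hypothetical helper: map a primary exchange name to a market code.
function marketCode(string $primaryExchange): string
{
    $exchange = strtolower($primaryExchange);
    // Cut at the first space found at or after offset 4; keep the whole string if none.
    $spacePos = strlen($exchange) > 4 ? strpos($exchange, ' ', 4) : false;
    $leading = $spacePos === false ? $exchange : substr($exchange, 0, $spacePos);
    return EXCHANGE_CODES[$leading] ?? 'others';
}

// One counter per distinct market code, plus the fallback bucket.
$counts = array_fill_keys(array_merge(array_unique(array_values(EXCHANGE_CODES)), ['others']), 0);

// Made-up sample names, not taken from the real API.
foreach (['NASDAQ Global Select', 'New York Stock Exchange', 'NYSE Arca', 'Cboe BZX', 'IEX'] as $name) {
    ++$counts[marketCode($name)];
}
```

Adding a new exchange then means adding one line to the constant rather than another case or if branch.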














$endgroup$









  • 1




    $begingroup$
    It just occurred to me that you will need to reset the current working directory to the script's directory after each iteration so that chdir() works on the subsequent iterations (or you can ignore the chdir() advice and just prefix the strings with $dir as you perform your glob() and unlink() processes).
    $endgroup$
    – mickmackusa
    Mar 18 at 1:58












  • $begingroup$
    On second thought, $dir probably takes care of that concern.
    $endgroup$
    – mickmackusa
    Mar 18 at 5:29











2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









5












$begingroup$

This is what I meant in the comments




You could do substr($hey,0,2) once and then check the first two letters instead of multiple strpos. Maybe, but I have no idea what $hey looks like :) - but than you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.




  switch(substr($hay,0,2)){
case 'na': //nasdaq
$mk='nasdaq-us';
$nasdaq++
break;
case 'ny': //nyse
case 'ne': //new york
$mk='nyse-us';
$nyse++;
break;
case 'cb': //cboe
$mk='cboe-us';
$cboe++;
break;
default:
$mk='market-us';
$others++;
break;
}


This way your doing 1 function call instead of up to 4.



It looks like your calling strtolower more than 3 times on $cn



  $cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // s
//...
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
//...
if(strtolower($symb)===strtolower($cn)){
//------------------
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');

if(strtolower($symb)===strtolower($cn)){


And so forth.



There may be other duplicate calls like this.



Your sprintf seem point less.



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

//you could just do this for example
$p1=self::SLASH.$symb.'-';


This whole chunk is suspect:



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

// duplication risk
if(strtolower($symb)===strtolower($cn)){
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
}else{
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
$lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
}


For example the only difference is this:



$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
//and
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
$lurl=$symb . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;


So if you could change the last argument and prepend $symb, you could maybe eliminate this condition. I have to think about it a bit... lol. But you see what I mean it could be more DRY (Don't repeat yourself). I don't know enough about the data to really say on this one. I was thinking something like this:



  if($symb != $cn){
$p = self::SLASH.$symb.'-';
$lurl='';
}else{
$p = self::SLASH.$cn.'-';
$lurl= $symb;
}

$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p));
$lurl .= "$cn-$sc-$mk-$enc";


But I am not sure if I got everything strait, lol. So make sure to test it. Kind of hard just working it out in my head. Still need a condition but it's a lot shorter and easier to read.



For this one:



searchFilenames



/**
*
* @return an array with values of paths of all front md files stored
*/
public static function searchFilenames($array,$re){
$arr= array();
foreach($array as $k=>$str){
$pos=strpos($str, $re);
if($pos!==false){
array_push($arr, $str);
}
}
return $arr;
}


You can use preg_grep. For example:



   public static function searchFilenames($array,$re){
return preg_grep('/'.preg_quote($re,'/').'/i',$array);
}
//or array_filter
public static function searchFilenames($array,$re){
return array_filter($array,function($item)use($re){ return strpos($re)!==false;});
}


Your just finding if $re is contained within each element of $array. preg_grep — Return array entries that match the pattern. It's also case insensitive with the i flag. In any case I never use array_push as $arr[]=$str is much faster. It's even better if you can just modify the array, as this is a function it's like a copy anyway as it's not passed by reference.



One thing I find useful is to take and add some example data values in to the code in comments. Then you can visualize what tranforms your doing and if your repeating yourself.



One last thing this one scares me a bit:



foreach($previousNames as $k=>$previousName){
if($k==0){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Here your checking that $k or the array key is 0, it's very easy to reset array keys when sorting or filtering. So be careful with that, I would think this to be a safer option.



foreach($previousNames as $k=>$previousName){
if($previousName!=$newName){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Not sure if that was a mistake, or maybe I just don't understand that part? It hard without being able to test what the value is. But it warranted mention, once the stuff is deleted its deleted.



Hope it helps you, most of these are minor things, really.






share|improve this answer











$endgroup$









  • 2




    $begingroup$
    Well I usually go over my own code at least three times, once to make it work, once to make it readable and clean up, and once for performance and security. It's like a process like writing a paper.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:06










  • $begingroup$
    @Emma - is this part of your code? UpdateStocks::searchFilenames you could re-write that or write a another one so that it avoids the array reverse, for example. Any function calls you can avoid will make it faster and simpler.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:09








  • 1




    $begingroup$
    Yea it's mostly small things, which are easy to get in there when your trying to get it to work. It's got to work first, then you can go look at the logic and flow of things. And see where you can reduce the complexity. That usually makes it easier to read, faster, and less error prone.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:29








  • 1




    $begingroup$
    I type A LOT of code, so I try to type as little as possible, lol.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:43
















5












$begingroup$

This is what I meant in the comments




You could do substr($hey,0,2) once and then check the first two letters instead of multiple strpos. Maybe, but I have no idea what $hey looks like :) - but than you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.




  switch(substr($hay,0,2)){
case 'na': //nasdaq
$mk='nasdaq-us';
$nasdaq++
break;
case 'ny': //nyse
case 'ne': //new york
$mk='nyse-us';
$nyse++;
break;
case 'cb': //cboe
$mk='cboe-us';
$cboe++;
break;
default:
$mk='market-us';
$others++;
break;
}


This way your doing 1 function call instead of up to 4.



It looks like your calling strtolower more than 3 times on $cn



  $cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // s
//...
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
//...
if(strtolower($symb)===strtolower($cn)){
//------------------
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');

if(strtolower($symb)===strtolower($cn)){


And so forth.



There may be other duplicate calls like this.



Your sprintf seem point less.



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

//you could just do this for example
$p1=self::SLASH.$symb.'-';


This whole chunk is suspect:



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

// duplication risk
if(strtolower($symb)===strtolower($cn)){
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
}else{
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
$lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
}


For example the only difference is this:



$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
//and
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
$lurl=$symb . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;


So if you could change the last argument and prepend $symb, you could maybe eliminate this condition. I have to think about it a bit... lol. But you see what I mean it could be more DRY (Don't repeat yourself). I don't know enough about the data to really say on this one. I was thinking something like this:



  if($symb != $cn){
$p = self::SLASH.$symb.'-';
$lurl='';
}else{
$p = self::SLASH.$cn.'-';
$lurl= $symb;
}

$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p));
$lurl .= "$cn-$sc-$mk-$enc";


But I am not sure if I got everything strait, lol. So make sure to test it. Kind of hard just working it out in my head. Still need a condition but it's a lot shorter and easier to read.



For this one:



searchFilenames



/**
*
* @return an array with values of paths of all front md files stored
*/
public static function searchFilenames($array,$re){
$arr= array();
foreach($array as $k=>$str){
$pos=strpos($str, $re);
if($pos!==false){
array_push($arr, $str);
}
}
return $arr;
}


You can use preg_grep. For example:



   public static function searchFilenames($array,$re){
return preg_grep('/'.preg_quote($re,'/').'/i',$array);
}
//or array_filter
public static function searchFilenames($array,$re){
return array_filter($array,function($item)use($re){ return strpos($re)!==false;});
}


Your just finding if $re is contained within each element of $array. preg_grep — Return array entries that match the pattern. It's also case insensitive with the i flag. In any case I never use array_push as $arr[]=$str is much faster. It's even better if you can just modify the array, as this is a function it's like a copy anyway as it's not passed by reference.



One thing I find useful is to take and add some example data values in to the code in comments. Then you can visualize what tranforms your doing and if your repeating yourself.



One last thing this one scares me a bit:



foreach($previousNames as $k=>$previousName){
if($k==0){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Here your checking that $k or the array key is 0, it's very easy to reset array keys when sorting or filtering. So be careful with that, I would think this to be a safer option.



foreach($previousNames as $k=>$previousName){
if($previousName!=$newName){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Not sure if that was a mistake, or maybe I just don't understand that part? It hard without being able to test what the value is. But it warranted mention, once the stuff is deleted its deleted.



Hope it helps you, most of these are minor things, really.






share|improve this answer











$endgroup$









  • 2




    $begingroup$
    Well I usually go over my own code at least three times, once to make it work, once to make it readable and clean up, and once for performance and security. It's like a process like writing a paper.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:06










  • $begingroup$
    @Emma - is this part of your code? UpdateStocks::searchFilenames you could re-write that or write a another one so that it avoids the array reverse, for example. Any function calls you can avoid will make it faster and simpler.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:09








  • 1




    $begingroup$
    Yea it's mostly small things, which are easy to get in there when your trying to get it to work. It's got to work first, then you can go look at the logic and flow of things. And see where you can reduce the complexity. That usually makes it easier to read, faster, and less error prone.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:29








  • 1




    $begingroup$
    I type A LOT of code, so I try to type as little as possible, lol.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:43














5












5








5





$begingroup$

This is what I meant in the comments




You could do substr($hey,0,2) once and then check the first two letters instead of multiple strpos. Maybe, but I have no idea what $hey looks like :) - but than you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.




  switch(substr($hay,0,2)){
case 'na': //nasdaq
$mk='nasdaq-us';
$nasdaq++
break;
case 'ny': //nyse
case 'ne': //new york
$mk='nyse-us';
$nyse++;
break;
case 'cb': //cboe
$mk='cboe-us';
$cboe++;
break;
default:
$mk='market-us';
$others++;
break;
}


This way your doing 1 function call instead of up to 4.



It looks like your calling strtolower more than 3 times on $cn



  $cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // s
//...
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
//...
if(strtolower($symb)===strtolower($cn)){
//------------------
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');

if(strtolower($symb)===strtolower($cn)){


And so forth.



There may be other duplicate calls like this.



Your sprintf seem point less.



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

//you could just do this for example
$p1=self::SLASH.$symb.'-';


This whole chunk is suspect:



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

// duplication risk
if(strtolower($symb)===strtolower($cn)){
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
}else{
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
$lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
}


For example the only difference is this:



$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
//and
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
$lurl=$symb . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;


So if you could change the last argument and prepend $symb, you could maybe eliminate this condition. I have to think about it a bit... lol. But you see what I mean it could be more DRY (Don't repeat yourself). I don't know enough about the data to really say on this one. I was thinking something like this:



  if($symb != $cn){
$p = self::SLASH.$symb.'-';
$lurl='';
}else{
$p = self::SLASH.$cn.'-';
$lurl= $symb;
}

$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p));
$lurl .= "$cn-$sc-$mk-$enc";


But I am not sure if I got everything strait, lol. So make sure to test it. Kind of hard just working it out in my head. Still need a condition but it's a lot shorter and easier to read.



For this one:



searchFilenames



/**
*
* @return an array with values of paths of all front md files stored
*/
public static function searchFilenames($array,$re){
$arr= array();
foreach($array as $k=>$str){
$pos=strpos($str, $re);
if($pos!==false){
array_push($arr, $str);
}
}
return $arr;
}


You can use preg_grep. For example:



   public static function searchFilenames($array,$re){
return preg_grep('/'.preg_quote($re,'/').'/i',$array);
}
//or array_filter
public static function searchFilenames($array,$re){
return array_filter($array,function($item)use($re){ return strpos($re)!==false;});
}


Your just finding if $re is contained within each element of $array. preg_grep — Return array entries that match the pattern. It's also case insensitive with the i flag. In any case I never use array_push as $arr[]=$str is much faster. It's even better if you can just modify the array, as this is a function it's like a copy anyway as it's not passed by reference.



One thing I find useful is to take and add some example data values in to the code in comments. Then you can visualize what tranforms your doing and if your repeating yourself.



One last thing this one scares me a bit:



foreach($previousNames as $k=>$previousName){
if($k==0){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Here your checking that $k or the array key is 0, it's very easy to reset array keys when sorting or filtering. So be careful with that, I would think this to be a safer option.



foreach($previousNames as $k=>$previousName){
if($previousName!=$newName){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Not sure if that was a mistake, or maybe I just don't understand that part? It hard without being able to test what the value is. But it warranted mention, once the stuff is deleted its deleted.



Hope it helps you, most of these are minor things, really.






share|improve this answer











$endgroup$



This is what I meant in the comments




You could do substr($hey,0,2) once and then check the first two letters instead of multiple strpos. Maybe, but I have no idea what $hey looks like :) - but than you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.




  switch(substr($hay,0,2)){
case 'na': //nasdaq
$mk='nasdaq-us';
$nasdaq++
break;
case 'ny': //nyse
case 'ne': //new york
$mk='nyse-us';
$nyse++;
break;
case 'cb': //cboe
$mk='cboe-us';
$cboe++;
break;
default:
$mk='market-us';
$others++;
break;
}


This way your doing 1 function call instead of up to 4.



It looks like your calling strtolower more than 3 times on $cn



  $cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // s
//...
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
//...
if(strtolower($symb)===strtolower($cn)){
//------------------
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');

if(strtolower($symb)===strtolower($cn)){


And so forth.



There may be other duplicate calls like this.



Your sprintf seem point less.



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

//you could just do this for example
$p1=self::SLASH.$symb.'-';


This whole chunk is suspect:



 // symbol in url
$p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
// company in url
$p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');

// duplication risk
if(strtolower($symb)===strtolower($cn)){
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
}else{
// duplicated name from one symbol
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
$lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
}


For example the only difference is this:



$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
$previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
//and
$lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
$lurl=$symb . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;


So if you could change the last argument and prepend $symb only when it is needed, you could maybe simplify this condition. I have to think about it a bit... lol. But you see what I mean: it could be more DRY (Don't Repeat Yourself). I don't know enough about the data to really say on this one. I was thinking something like this:



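  // assumes $symb and $cn were already lowercased once, earlier
  if($symb != $cn){
      $p = self::SLASH.$symb.'-';
      $lurl = $symb.'-'; // symbol goes in front only when it differs from the company
  }else{
      $p = self::SLASH.$cn.'-';
      $lurl = '';
  }

  $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p));
  $lurl .= "$cn-$sc-$mk-$enc";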


But I am not sure if I got everything straight, lol, so make sure to test it. It's kind of hard just working it out in my head. You still need a condition, but it's a lot shorter and easier to read.
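
Since this is easy to get backwards, here is a small testable sketch of the same idea. The function name buildNames and the sample arguments are hypothetical, and '/' stands in for self::SLASH; the expected shapes follow the two branches of the original code ($cn-$sc-$mk-$enc when symbol and company match, $symb-$cn-$sc-$mk-$enc otherwise).

```php
<?php
// Hypothetical helper; the parameter names mirror the variables in the question.
function buildNames(string $symb, string $cn, string $sc, string $mk, string $enc): array
{
    // Normalize once so the comparison and the concatenation agree.
    $symb = strtolower($symb);
    $cn   = strtolower($cn);

    // Search prefix: the company slug when it equals the symbol, else the symbol.
    $prefix = '/' . ($symb === $cn ? $cn : $symb) . '-';

    // Only prepend the symbol when it differs from the company slug.
    $lurl = ($symb === $cn ? '' : $symb . '-') . "$cn-$sc-$mk-$enc";

    return [$prefix, $lurl];
}

[$prefix, $lurl] = buildNames('AAPL', 'apple', 'tech', 'nasdaq-us', 'x1');
echo $prefix, ' ', $lurl, PHP_EOL;
```

A side effect of writing it this way is that the prefix turns out to be the same string in both branches, because the two values are equal whenever the first branch is taken.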



For this one:



searchFilenames



/**
*
* @return an array with values of paths of all front md files stored
*/
public static function searchFilenames($array,$re){
$arr= array();
foreach($array as $k=>$str){
$pos=strpos($str, $re);
if($pos!==false){
array_push($arr, $str);
}
}
return $arr;
}


You can use preg_grep. For example:



   public static function searchFilenames($array,$re){
       return preg_grep('/'.preg_quote($re,'/').'/i',$array);
   }
   //or array_filter
   public static function searchFilenames($array,$re){
       return array_filter($array,function($item)use($re){ return strpos($item,$re)!==false;});
   }
   //note: both keep the original array keys; wrap the result in array_values() if you need 0-based keys


You're just finding whether $re is contained within each element of $array. preg_grep returns the array entries that match the pattern, and with the i flag it is also case insensitive (your strpos version is case sensitive). In any case, I never use array_push for a single element, as $arr[]=$str avoids the overhead of a function call. It's even better if you can just modify the array directly: since this is a function and the array is not passed by reference, you're working on a copy anyway.
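
One detail worth checking with either replacement is what happens to the array keys. The following sketch uses made-up paths to show that both array_filter and preg_grep keep the keys of the input array, which matters if later code looks for key 0.

```php
<?php
// Sample paths (made up); index 1 is the only one without the needle.
$files  = ['/md/aapl-apple-a1.md', '/md/msft-microsoft-b2.md', '/md/aapl-apple-c3.md'];
$needle = '/aapl-';

// Case-sensitive substring filter; keys of the input array are preserved.
$kept = array_filter($files, function ($item) use ($needle) {
    return strpos($item, $needle) !== false;
});
print_r(array_keys($kept)); // keys 0 and 2, not 0 and 1

// Case-insensitive regex variant; preg_quote() escapes the needle and the delimiter.
$grepped = preg_grep('/' . preg_quote($needle, '/') . '/i', $files);

// Reindex when later code expects consecutive 0-based keys.
$reindexed = array_values($kept);
```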



One thing I find useful is to add some example data values to the code in comments. Then you can visualize what transforms you're doing and whether you're repeating yourself.



One last thing, this one scares me a bit:



foreach($previousNames as $k=>$previousName){
if($k==0){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Here you're checking that $k, the array key, is 0. It's very easy to change or reset array keys when sorting or filtering, so be careful with that. I would think this to be a safer option:



foreach($previousNames as $k=>$previousName){
if($previousName!=$newName){
// safety: if previous filename not exactly equal to new filename
rename($previousName, $newName);
}else{
// in case multiple files found: unlink
unlink($previousName);
}
}


Not sure if that was a mistake, or maybe I just don't understand that part? It's hard to say without being able to test what the values are. But it warranted a mention: once the files are deleted, they're deleted.
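
If the intent is "keep the first match under the new name and delete the rest", one key-independent way to express that is to pull the first element off with array_shift. This is only a sketch under that assumption, run in a throwaway temp directory with made-up file names. Note that comparing names alone, as in the loop above, would rename every non-matching file onto $newName in turn and would unlink a file that already carries the new name, so test either variant on copies first.

```php
<?php
// Build a throwaway directory with three made-up "previous" files.
$dir = sys_get_temp_dir() . '/md_demo_' . uniqid();
mkdir($dir);
foreach (['aapl-old-1.md', 'aapl-old-2.md', 'aapl-old-3.md'] as $name) {
    file_put_contents("$dir/$name", 'x');
}

$previousNames = glob("$dir/aapl-*"); // glob() sorts its matches unless GLOB_NOSORT is passed
$newName       = "$dir/aapl-new.md";

// Take the first match regardless of how the array is keyed.
$first = array_shift($previousNames);
if ($first !== null && $first !== $newName) {
    rename($first, $newName);
}

// Everything still in the list is a leftover duplicate; never delete the target itself.
foreach ($previousNames as $previousName) {
    if ($previousName !== $newName) {
        unlink($previousName);
    }
}

$remaining = array_map('basename', glob("$dir/*"));
print_r($remaining);

// Clean up the demo directory.
unlink($newName);
rmdir($dir);
```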



Hope it helps you, most of these are minor things, really.















edited Mar 16 at 20:55

























answered Mar 16 at 19:33









ArtisticPhoenix








  • 2




    $begingroup$
    Well, I usually go over my own code at least three times: once to make it work, once to make it readable and clean it up, and once for performance and security. It's a process, like writing a paper.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:06










  • $begingroup$
    @Emma - is UpdateStocks::searchFilenames part of your code? You could rewrite it, or write another one, so that it avoids the array reverse, for example. Any function calls you can avoid will make it faster and simpler.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:09








  • 1




    $begingroup$
    Yeah, it's mostly small things, which are easy to get in there when you're trying to get it to work. It's got to work first, then you can go look at the logic and flow of things and see where you can reduce the complexity. That usually makes it easier to read, faster, and less error prone.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:29








  • 1




    $begingroup$
    I type A LOT of code, so I try to type as little as possible, lol.
    $endgroup$
    – ArtisticPhoenix
    Mar 16 at 20:43



























1












$begingroup$

I am only discovering this question after attending to this associated question from the OP: Writing and Updating ~8K-10K Iterations of URLs Strings on a Text File (PHP, Performance, CRON)




  1. I go to great lengths to avoid the use of "large-battery" if blocks and switch blocks in my code because they are so verbose. You have some predictable/static exchange codes so you can craft a single lookup array at the start of your class and leverage that for all subsequent processes. Having a single lookup array in an easy-to-find location will make your code more manageable for you and other developers. Once you have your lookup array (I'll call it const EXCHANGE_CODES), you can initiate an array to store the running tally for each market code encountered (I'll call it $exchange_counts); this array should be declared one time before the loop is started. Inside the loop, you can use strpos() and substr() to extract the targeted substrings that you posted in your question. Then simply check if the substring exists as a key in EXCHANGE_CODES, declare the found associated value, and increment the respective exchange count.


  2. I see that you are using very few characters in your variable declarations. This requires you to write comments at the end of each line to remind you and other developers what data is held in the variable. This is needlessly inconvenient. Better practice would be to assign meaningful names to your variables.


  3. When preparing the sector slug value, you are passing single-element arrays to str_replace(); this is unnecessary -- just pass single strings.


  4. Use a single space after commas when writing function parameters, as well as on either side of all =.


  5. I don't know if $s and $arr are the same incoming array and it is a typo while posting or if they are separate incoming arrays. Either way, the variable names should be more informative. If your script is always accessing the quote subarray, then you might like to declare $quote = $array['quote']; early in your script to allow for the simpler use of $quote. This isn't a big deal, just something to consider.


  6. Changing your current working directory to the new directory will spare you needing to add the variable to the glob() parameter AND it will shorten the strings that are being filtered -- meaning less work for php.


  7. You can put glob()'s excellent filtering feature to good use and avoid calling your static method entirely.



Finally, as I said in your SO question, you should try to consolidate and minimize total file writes if possible.



Here's some untested code to reflect my advice:



// this can be declared with your other class constants (array declaration available from php5.6+):
const EXCHANGE_CODES = [
"nasdaq" => "nasdaq-us",
"nyse" => "nyse-us",
"new york" => "nyse-us",
"cboe" => "cboe-us"
];

// initialize assoc array of counts prior to your loop
$exchange_counts = array_fill_keys(self::EXCHANGE_CODES + ["others"], 0);
/* makes:
* array (
* 'nasdaq-us' => 0,
* 'nyse-us' => 0,
* 'cboe-us' => 0,
* 'others' => 0,
* )
*/

// start actual processing
$company_slug = strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"]));

$exchange_market = strtolower($arr["quote"]["primaryExchange"]);

// look up market_code using the exchange name truncated at the first space found at or after offset 4
$space_pos = strlen($exchange_market) >= 4 ? strpos($exchange_market, ' ', 4) : false;
$leading_text = $space_pos === false ? $exchange_market : substr($exchange_market, 0, $space_pos);
$market_code = self::EXCHANGE_CODES[$leading_text] ?? 'others'; // null coalescing operator (php7+)
++$exchange_counts[$market_code];

$sector_slug = str_replace(' ', '-', strtolower($s["quote"]["sector"]));

$random_string = UpdateStocks::getEnc($symb, $symb, $symb, self::START_POINT_URL_ENCRYPTION_APPEND, self::LENGTH_URL_ENCRYPTION_APPEND);

$dir = __DIR__ . self::DIR_FRONT_SYMBOLS_MD_FILES; // symbols front directory

$equity_symbol = strtolower($symb); // $symb is the ticker symbol variable from the question's code
$slug_start = $company_slug === $equity_symbol ? $company_slug : $equity_symbol;

if (!is_dir($dir)) {
mkdir($dir, 0755, true); // creates price targets directory if not exist (recursively)
} else {
chdir($dir); // change current working directory
$preexisting_files = glob("{$slug_start}-*"); // separate static method call is avoided entirely (not sure why you are reversing)
// if you want to eradicate near duplicate files, okay, but tread carefully -- it's permanent.
}

$new_slug = $slug_start . '-' . $sector_slug . '-' . $market_code . '-' . $random_string;

$new_md_filename = preg_replace('/-{2,}/', '-', $dir . self::SLASH . $new_slug . self::EXTENSION_MD);

if (empty($preexisting_files)) {
// I don't advise the iterated opening,writing an empty file,closing 10,000x
}














$endgroup$









  • 1




    $begingroup$
    It just occurred to me that you will need to reset the current working directory to the script's directory after each iteration so that chdir() works on the subsequent iterations. (Or you can ignore the chdir() advice and just prefix the strings with $dir as you perform your glob() and unlink() processes.)
    $endgroup$
    – mickmackusa
    Mar 18 at 1:58












  • $begingroup$
    On second thought, $dir probably takes care of that concern.
    $endgroup$
    – mickmackusa
    Mar 18 at 5:29


























answered Mar 18 at 0:36









mickmackusa







