Writing a ~100Kb HTML string over an MD file (number of iterations ~10K)
$begingroup$
I am writing a large string (~100-120 KB of HTML) to an md file, and I am pretty sure my method is not the fastest, even though it only has to iterate ~8,000-10,000 times, a few times per hour.
There is also a low (~1%-2%) probability that the target file already exists under an old name (previousName) that does not exactly match the new name (newName), because the data flows through an API.
Key Script: Inside For Loop
    $cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // slug company
    $hay=strtolower($arr["quote"]["primaryExchange"]); // exchange market
    if(strpos($hay, 'nasdaq')===0){
        $mk='nasdaq-us';
        $nasdaq++;
    }elseif(strpos($hay, 'nyse')===0 || strpos($hay, 'new york')===0){
        $mk='nyse-us';
        $nyse++;
    }elseif(strpos($hay, 'cboe')===0){
        $mk='cboe-us';
        $cboe++;
    }else{
        $mk='market-us';
        $others++;
    }
    $sc=str_replace(array(' '), array('-'), strtolower($s["quote"]["sector"])); // slug sector
    $enc=UpdateStocks::getEnc($symb,$symb,$symb,self::START_POINT_URL_ENCRYPTION_APPEND, self::LENGTH_URL_ENCRYPTION_APPEND); // simple 4 length encryption output: e.g., 159a
    $dir=__DIR__ . self::DIR_FRONT_SYMBOLS_MD_FILES; // symbols front directory
    if(!is_dir($dir)){mkdir($dir, 0755,true);} // creates price targets directory if not exist
    // symbol in url
    $p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
    // company in url
    $p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
    // duplication risk
    if(strtolower($symb)===strtolower($cn)){
        // duplicated name from one symbol
        $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
        $lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
    }else{
        // duplicated name from one symbol
        $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
        $lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
    }
    // new md filename
    $newName=$dir . self::SLASH . $lurl . self::EXTENSION_MD;
    // Replace multiple dashes with single dash: "aa-alcoa-basic-materials-nyse-us-159a"
    $newName = preg_replace('/-{2,}/', '-', $newName);
    // if file not exist: generate file
    if($previousNames==null){
        $fh=fopen($newName, 'wb');
        fwrite($fh, '');
        fclose($fh);
    }else{
        // if file not exist:
        foreach($previousNames as $k=>$previousName){
            if($k==0){
                // safety: if previous filename not exactly equal to new filename
                rename($previousName, $newName);
            }else{
                // in case multiple files found: unlink
                unlink($previousName);
            }
        }
    }
    // This method is not review required now.
    $mdFileContent=UpdateStocks::getBaseHTML($s,$l,$z); // gets HTML
    if(file_exists($newName)){
        if(is_writable($newName)){
            file_put_contents($newName,$mdFileContent);
            echo $symb. " 💚 " . self::NEW_LINE;
        }else{
            echo $symb . " symbol file in front directory is not writable in " . __METHOD__ . " 💔" . self::NEW_LINE;
        }
    }else{
        echo $symb . " file not found in " . __METHOD__ . " 💛" . self::NEW_LINE;
    }
    }
searchFilenames
    /**
     *
     * @return an array with values of paths of all front md files stored
     */
    public static function searchFilenames($array,$re){
        $arr= array();
        foreach($array as $k=>$str){
            $pos=strpos($str, $re);
            if($pos!==false){
                array_push($arr, $str);
            }
        }
        return $arr;
    }
var_dump($hay)
string(20) "nasdaq global"
string(23) "new york stock exchange"
string(20) "nasdaq global market"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(23) "nyse arca"
string(23) "new york stock exchange"
string(23) "nyse"
string(20) "nasdaq"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(23) "cboe"
string(23) "new york stock exchange"
string(20) "nasdaq global select"
string(20) "nasdaq global select"
string(23) "new york stock exchange"
string(23) "new york stock exchange"
...
>5% Probability of var_dump($lurl)
For strtolower($symb)===strtolower($cn)
string(27) "aac-healthcare-nyse-us-e92a"
string(35) "aaon-basic-materials-nasdaq-us-238e"
string(28) "abb-industrials-nyse-us-a407"
string(38) "acnb-financial-services-nasdaq-us-19fa"
<95% Probability of var_dump($lurl)
For not strtolower($symb)===strtolower($cn)
string(50) "aadr-advisorshares-dorsey-wright-adr--nyse-us-d842"
string(39) "aal-airlines-industrials-nasdaq-us-29eb"
string(68) "aamc-altisource-asset-management-com-financial-services-nyse-us-b46a"
string(47) "aame-atlantic-financial-services-nasdaq-us-8944"
string(35) "aan-aarons-industrials-nyse-us-d00e"
string(54) "aaoi-applied-optoelectronics-technology-nasdaq-us-1dee"
string(56) "aap-advance-auto-parts-wi-consumer-cyclical-nyse-us-1f60"
string(36) "aapl-apple-technology-nasdaq-us-8f4c"
string(35) "aat-assets-real-estate-nyse-us-3598"
string(49) "aau-almaden-minerals-basic-materials-nyse-us-1c57"
string(51) "aaww-atlas-air-worldwide-industrials-nasdaq-us-69f3"
string(59) "aaxj-ishares-msci-all-country-asia-ex-japan--nasdaq-us-c6c4"
string(47) "aaxn-axon-enterprise-industrials-nasdaq-us-0eef"
string(58) "ab-alliancebernstein-units-financial-services-nyse-us-deb1"
$symb: an uppercase string that stands for the "symbol" of an equity, sometimes with dashes.
"AADR"
"AAL"
"AAMC"
"AAME"
"AAN"
"AAOI"
"AAP"
"AAPL"
"AAT"
"AAU"
"AAWW"
"AAXJ"
"AAXN"
"AB"
"GS-A"
"GS-B"
"GS-C"
Would you be so kind as to help me turn this into a faster/simpler script?
performance beginner php strings file-system
$endgroup$
1
$begingroup$
You could do substr($hay, 0, 2) once and then check the first two letters instead of calling strpos multiple times. Maybe, but I have no idea what $hay looks like :) - but then you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.
$endgroup$
– ArtisticPhoenix
Mar 16 at 19:27
1
$begingroup$
@Emma I have advice to give, but I don't know what $symb is / is doing.
$endgroup$
– mickmackusa
Mar 17 at 23:44
1
$begingroup$
Do not change your posted question here. Too late now. You'll suffer some wrath if you do. I've got some good stuff to show you on this one. (I already spent a fair amount of time on this one this morning.) What is $symb? (If you "ping" me on pages where I have not been active, I will not be alerted.)
$endgroup$
– mickmackusa
Mar 18 at 0:01
asked Mar 16 at 19:19 by Emma; edited Mar 18 at 0:15
2 Answers
$begingroup$
This is what I meant in the comments:
You could do substr($hay, 0, 2) once and then check the first two letters instead of calling strpos multiple times. Maybe, but I have no idea what $hay looks like :) - but then you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.
    switch(substr($hay,0,2)){
        case 'na': //nasdaq
            $mk='nasdaq-us';
            $nasdaq++;
            break;
        case 'ny': //nyse
        case 'ne': //new york
            $mk='nyse-us';
            $nyse++;
            break;
        case 'cb': //cboe
            $mk='cboe-us';
            $cboe++;
            break;
        default:
            $mk='market-us';
            $others++;
            break;
    }
This way you're doing 1 function call instead of up to 4.
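One caveat with the two-letter switch: it matches more loosely than the original strpos checks, so a made-up exchange name such as "national stock exchange" would be counted as NASDAQ. A minimal sketch that keeps the full prefixes but still reads like a lookup table could look like this. The function name classifyExchange and the $prefixes map are illustrative only and are not part of the original class:

```php
<?php
// Hypothetical helper: maps a lowercase exchange name to a market slug.
function classifyExchange(string $hay): string
{
    // Full prefixes, the same ones the question tests with strpos().
    $prefixes = [
        'nasdaq'   => 'nasdaq-us',
        'nyse'     => 'nyse-us',
        'new york' => 'nyse-us',
        'cboe'     => 'cboe-us',
    ];
    foreach ($prefixes as $prefix => $market) {
        // strncmp() compares only the first strlen($prefix) bytes.
        if (strncmp($hay, $prefix, strlen($prefix)) === 0) {
            return $market;
        }
    }
    return 'market-us'; // fallback, as in the question
}

echo classifyExchange('nasdaq global select'), PHP_EOL;    // nasdaq-us
echo classifyExchange('new york stock exchange'), PHP_EOL; // nyse-us
echo classifyExchange('national stock exchange'), PHP_EOL; // market-us
```

The per-market counters could then be kept in one array, for example $counts[$mk] = ($counts[$mk] ?? 0) + 1, instead of four separate variables.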
It looks like you're calling strtolower more than 3 times on $cn:
    $cn=strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"])); // s
    //...
    $p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
    //...
    if(strtolower($symb)===strtolower($cn)){
    //------------------
    $p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
    if(strtolower($symb)===strtolower($cn)){
And so forth.
There may be other duplicate calls like this.
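As a rough sketch of what I mean, each value can be lowercased once and the result reused. The sample values and the variable names $symbLower, $companySlug and $sameName are made up for illustration; in the real loop they would come from the API data and slugCompany():

```php
<?php
// Sample values; in the question these come from the API response.
$symb = 'AAPL';
$companySlug = 'Apple'; // stands in for the output of slugCompany()

// Normalise each value exactly once...
$symbLower = strtolower($symb);
$cn = strtolower($companySlug);

// ...then reuse the results everywhere else in the loop body.
$p1 = '/' . $symbLower . '-';      // symbol in url
$p2 = '/' . $cn . '-';             // company in url
$sameName = ($symbLower === $cn);  // duplication check

var_dump($p1, $p2, $sameName);
```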
Your sprintf calls seem pointless.
    // symbol in url
    $p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
    // company in url
    $p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
    //you could just do this for example
    $p1=self::SLASH.$symb.'-';
This whole chunk is suspect:
    // symbol in url
    $p1=sprintf('%s%s%s',self::SLASH,strtolower($symb),'-');
    // company in url
    $p2=sprintf('%s%s%s',self::SLASH,strtolower($cn),'-');
    // duplication risk
    if(strtolower($symb)===strtolower($cn)){
        // duplicated name from one symbol
        $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
        $lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
    }else{
        // duplicated name from one symbol
        $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
        $lurl=strtolower($symb) . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
    }
For example, the only difference is this:
    $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p2));
    $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p1));
    //and
    $lurl=$cn . '-' . $sc . '-' . $mk . '-' . $enc;
    $lurl=$symb . '-' . $cn . '-' . $sc . '-' . $mk . '-' . $enc;
So if you could change the last argument and prepend $symb, you could maybe eliminate this condition. I have to think about it a bit... lol. But you see what I mean: it could be more DRY (Don't Repeat Yourself). I don't know enough about the data to really say on this one. I was thinking something like this:
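    // assumes $symb and $cn were each lowercased once, earlier in the loop
    if($symb != $cn){
        $p = self::SLASH.$symb.'-';
        $lurl = $symb.'-'; // symbol prefix only when it differs from the company slug
    }else{
        $p = self::SLASH.$cn.'-';
        $lurl = '';
    }
    $previousNames=array_reverse(UpdateStocks::searchFilenames(glob($dir."/*"),$p));
    $lurl .= "$cn-$sc-$mk-$enc";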
But I am not sure if I got everything straight, lol. So make sure to test it. It's kind of hard just working it out in my head. It still needs a condition, but it's a lot shorter and easier to read.
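Something else that may be worth measuring: glob($dir."/*") runs on every iteration, so the directory is listed once per symbol. A minimal sketch of listing it once before the loop and filtering the in-memory list is below. It assumes the directory contents do not need to be re-read mid-run (renames made during the loop would leave the list stale), and the sample file names and the $allFiles / $found variables are made up for illustration:

```php
<?php
// Build a throwaway directory with sample files so the sketch can run.
$dir = sys_get_temp_dir() . '/md-sketch-' . uniqid();
mkdir($dir, 0755, true);
foreach (['aal-airlines-29eb.md', 'aapl-apple-8f4c.md', 'ab-units-deb1.md'] as $name) {
    touch($dir . '/' . $name);
}

// List the directory once, before the loop, instead of once per symbol.
$allFiles = glob($dir . '/*');

$found = [];
foreach (['AAL', 'AAPL', 'AB'] as $symb) {
    $needle = '/' . strtolower($symb) . '-';
    // Same substring test as searchFilenames(), run on the in-memory list.
    $found[$symb] = array_values(array_filter(
        $allFiles,
        function ($path) use ($needle) {
            return strpos($path, $needle) !== false;
        }
    ));
}

print_r(array_map('count', $found)); // one match per symbol

// Clean up the sample files.
array_map('unlink', $allFiles);
rmdir($dir);
```

Whether this helps in practice depends on how many files are in the directory, so time both versions before committing to it.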
For this one:
searchFilenames
    /**
     *
     * @return an array with values of paths of all front md files stored
     */
    public static function searchFilenames($array,$re){
        $arr= array();
        foreach($array as $k=>$str){
            $pos=strpos($str, $re);
            if($pos!==false){
                array_push($arr, $str);
            }
        }
        return $arr;
    }
You can use preg_grep. For example:
    public static function searchFilenames($array,$re){
        return preg_grep('/'.preg_quote($re,'/').'/i',$array);
    }
    //or array_filter
    public static function searchFilenames($array,$re){
        return array_filter($array,function($item)use($re){ return strpos($item,$re)!==false;});
    }
You're just finding out whether $re is contained within each element of $array. preg_grep returns the array entries that match the pattern, and with the i flag it is also case insensitive. In any case, I never use array_push, as $arr[]=$str is faster because it avoids the overhead of a function call. It's even better if you can just modify the array; since this is a function, you are working on a copy anyway, as the array is not passed by reference.
One thing I find useful is to add some example data values to the code in comments. Then you can visualize what transforms you're doing and whether you're repeating yourself.
One last thing, this one scares me a bit:
    foreach($previousNames as $k=>$previousName){
        if($k==0){
            // safety: if previous filename not exactly equal to new filename
            rename($previousName, $newName);
        }else{
            // in case multiple files found: unlink
            unlink($previousName);
        }
    }
Here you're checking that $k, the array key, is 0. It's very easy to change or reset array keys when sorting or filtering, so be careful with that. I would think this to be a safer option:
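$kept=false; // becomes true once one file carries the new name
foreach($previousNames as $previousName){
if(!$kept){
// keep the first match; rename it only if the name actually changed
if($previousName!==$newName){
rename($previousName, $newName);
}
$kept=true;
}elseif($previousName!==$newName){
// in case multiple files found: unlink the extras, never the kept file
unlink($previousName);
}
}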
Not sure if that was a mistake, or maybe I just don't understand that part? It's hard without being able to test what the value is. But it warranted a mention: once the stuff is deleted, it's deleted.
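The point about keys can be seen in a few lines. This is a sketch with made-up file names: array_filter keeps the original keys, so key 0 may not exist at all after filtering, while array_reverse (with its default arguments) and array_values renumber numeric keys from 0.

```php
<?php
// Made-up file names, for illustration only.
$files = ['a-1.md', 'b-1.md', 'b-2.md'];

// Filtering keeps the original keys: the result has keys 1 and 2, no key 0.
$matches = array_filter($files, function ($f) {
    return strpos($f, 'b-') === 0;
});

// array_reverse() renumbers numeric keys by default, so key 0 exists again.
$reversed = array_reverse($matches);

// array_values() renumbers without changing the order.
$reindexed = array_values($matches);

var_dump(isset($matches[0])); // bool(false)
print_r($reversed);           // b-2.md at key 0, b-1.md at key 1
print_r($reindexed);          // b-1.md at key 0, b-2.md at key 1
```

So a check like $k==0 only works as long as something (here array_reverse) happens to renumber the keys before the loop.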
Hope it helps you, most of these are minor things, really.
$endgroup$
$begingroup$
Well, I usually go over my own code at least three times: once to make it work, once to clean it up and make it readable, and once for performance and security. It's a process, like writing a paper.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:06
$begingroup$
@Emma - is UpdateStocks::searchFilenames part of your code? You could rewrite that, or write another one, so that it avoids the array reverse, for example. Any function calls you can avoid will make it faster and simpler.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:09
$begingroup$
Yeah, it's mostly small things, which easily creep in when you're trying to get it to work. It's got to work first; then you can look at the logic and flow of things and see where you can reduce the complexity. That usually makes it easier to read, faster, and less error-prone.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:29
$begingroup$
I type A LOT of code, so I try to type as little as possible, lol.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:43
$begingroup$
I am only discovering this question after attending to this associated question from the OP: Writing and Updating ~8K-10K Iterations of URLs Strings on a Text File (PHP, Performance, CRON)
I go to great lengths to avoid the use of "large-battery" if blocks and switch blocks in my code because they are so verbose. You have some predictable/static exchange codes, so you can craft a single lookup array at the start of your class and leverage that for all subsequent processes. Having a single lookup array in an easy-to-find location will make your code more manageable for you and other developers. Once you have your lookup array (I'll call it const EXCHANGE_CODES), you can initiate an array to store the running tally for each market code encountered (I'll call it $exchange_counts); this array should be declared one time before the loop is started. Inside the loop, you can use strpos() and substr() to extract the targeted substrings that you posted in your question. Then simply check if the substring exists as a key in EXCHANGE_CODES, declare the found associated value, and increment the respective exchange count.
I see that you are using very few characters in your variable declarations. This requires you to write comments at the end of each line to remind you and other developers what data is held in the variable. This is needlessly inconvenient. Better practice would be to assign meaningful names to your variables.
When preparing the sector slug value, you are passing single-element arrays to str_replace(); this is unnecessary -- just pass them as single strings.
Use a single space after commas when writing function parameters, as well as on either side of all = signs.
I don't know if $s and $arr are the same incoming array and it is a typo while posting, or if they are separate incoming arrays. Either way, the variable names should be more informative. If your script is always accessing the quote subarray, then you might like to declare $quote = $array['quote']; early in your script to allow for the simpler use of $quote. This isn't a big deal, just something to consider.
Changing your current working directory to the new directory will spare you needing to add the variable to the glob() parameter AND it will shorten the strings that are being filtered -- meaning less work for php.
You can put glob()'s excellent filtering feature to good use and avoid calling your static method entirely.
Finally, as I said in your SO question, you should try to consolidate and minimize total file writes if possible.
Here's some untested code to reflect my advice:
// this can be declared with your other class constants (array declaration available from php5.6+):
const EXCHANGE_CODES = [
"nasdaq" => "nasdaq-us",
"nyse" => "nyse-us",
"new york" => "nyse-us",
"cboe" => "cboe-us"
];
// initialize assoc array of counts prior to your loop
$exchange_counts = array_fill_keys(self::EXCHANGE_CODES + ["others"], 0);
/* makes:
* array (
* 'nasdaq-us' => 0,
* 'nyse-us' => 0,
* 'cboe-us' => 0,
* 'others' => 0,
* )
*/
// start actual processing
$company_slug = strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"]));
$exchange_market = strtolower($arr["quote"]["primaryExchange"]);
// look up market_code using the exchange name truncated at the first space found at or after offset 4
$space = strpos($exchange_market, ' ', min(4, strlen($exchange_market)));
$leading_text = $space === false ? $exchange_market : substr($exchange_market, 0, $space); // no space found: use the whole name
$market_code = self::EXCHANGE_CODES[$leading_text] ?? 'others'; // null coalescing operator from php7+
++$exchange_counts[$market_code];
$sector_slug = str_replace(' ', '-', strtolower($s["quote"]["sector"]));
$random_string = UpdateStocks::getEnc($symb, $symb, $symb, self::START_POINT_URL_ENCRYPTION_APPEND, self::LENGTH_URL_ENCRYPTION_APPEND);
$dir = __DIR__ . self::DIR_FRONT_SYMBOLS_MD_FILES; // symbols front directory
$equity_symbol = strtolower($symb); // $symb is the symbol variable used above
$slug_start = $company_slug === $equity_symbol ? $company_slug : $equity_symbol;
if (!is_dir($dir)) {
mkdir($dir, 0755, true); // creates price targets directory if not exist (recursively)
} else {
chdir($dir); // change current working directory
$preexisting_files = glob("{$slug_start}-*"); // separate static method call is avoided entirely (not sure why you are reversing)
// if you want to eradicate near duplicate files, okay, but tread carefully -- it's permanent.
}
$new_slug = $slug_start . '-' . $sector_slug . '-' . $market_code . '-' . $random_string;
$new_md_filename = preg_replace('/-{2,}/', '-', $dir . self::SLASH . $new_slug . self::EXTENSION_MD);
if (empty($preexisting_files)) {
// I don't advise the iterated opening,writing an empty file,closing 10,000x
}
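The lookup-and-tally part of the advice above can be tried in isolation. The following is a minimal sketch, not the class from the question: marketCode is a hypothetical function name, the constant is declared at file level instead of inside the class, and the exchange names in the loop are made-up sample inputs.

```php
<?php
// File-level copy of the lookup array, for a standalone sketch.
const EXCHANGE_CODES = [
    'nasdaq'   => 'nasdaq-us',
    'nyse'     => 'nyse-us',
    'new york' => 'nyse-us',
    'cboe'     => 'cboe-us',
];

// Hypothetical helper: maps a primary exchange name to a market code.
function marketCode(string $primaryExchange): string
{
    $exchange = strtolower($primaryExchange);
    // First space at or after offset 4; the min() guard keeps short names valid.
    $space = strpos($exchange, ' ', min(4, strlen($exchange)));
    $leading = $space === false ? $exchange : substr($exchange, 0, $space);
    return EXCHANGE_CODES[$leading] ?? 'others';
}

// Duplicate codes collapse because they are used as keys.
$counts = array_fill_keys(array_merge(array_values(EXCHANGE_CODES), ['others']), 0);

// Made-up sample inputs.
$samples = ['NASDAQ Global Select', 'New York Stock Exchange', 'NYSE Arca', 'Cboe', 'IEX'];
foreach ($samples as $name) {
    ++$counts[marketCode($name)];
}

print_r($counts);
```

Keeping the mapping in one array means a new exchange only needs one new entry rather than another case or if branch.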
$endgroup$
$begingroup$
Just occurred to me that you will need to reset the current working directory to the script's directory after each iteration so that chdir() works on the subsequent iterations (or you can ignore the chdir() advice and just pad the strings with $dir as you perform your glob() and unlink() processes).
$endgroup$
– mickmackusa
Mar 18 at 1:58
$begingroup$
On second thought, $dir probably takes care of that concern.
$endgroup$
– mickmackusa
Mar 18 at 5:29
edited Mar 16 at 20:55
answered Mar 16 at 19:33
ArtisticPhoenix
2
$begingroup$
Well I usually go over my own code at least three times, once to make it work, once to make it readable and clean up, and once for performance and security. It's like a process like writing a paper.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:06
$begingroup$
@Emma - is UpdateStocks::searchFilenames part of your code? You could rewrite that, or write another one, so that it avoids the array reverse, for example. Any function calls you can avoid will make it faster and simpler.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:09
1
$begingroup$
Yeah, it's mostly small things, which are easy to get in there when you're trying to get it to work. It's got to work first; then you can go look at the logic and flow of things and see where you can reduce the complexity. That usually makes it easier to read, faster, and less error-prone.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:29
1
$begingroup$
I type A LOT of code, so I try to type as little as possible, lol.
$endgroup$
– ArtisticPhoenix
Mar 16 at 20:43
$begingroup$
I am only discovering this question after attending to this associated question from the OP: Writing and Updating ~8K-10K Iterations of URLs Strings on a Text File (PHP, Performance, CRON)
I go to great lengths to avoid the use of "large-battery" if blocks and switch blocks in my code because they are so verbose. You have some predictable/static exchange codes, so you can craft a single lookup array at the start of your class and leverage that for all subsequent processes. Having a single lookup array in an easy-to-find location will make your code more manageable for you and other developers. Once you have your lookup array (I'll call it const EXCHANGE_CODES), you can initialize an array to store the running tally for each market code encountered (I'll call it $exchange_counts); this array should be declared one time, before the loop is started. Inside the loop, you can use strpos() and substr() to extract the targeted substrings that you posted in your question. Then simply check if the substring exists as a key in EXCHANGE_CODES, declare the found associated value, and increment the respective exchange count.
I see that you are using very few characters in your variable names. This requires you to write comments at the end of each line to remind you and other developers what data is held in the variable. This is needlessly inconvenient. Better practice would be to assign meaningful names to your variables.
When preparing the sector slug value, you are passing single-element arrays to str_replace(). This is unnecessary; just pass single strings.
Use a single space after commas when writing function parameters, as well as on either side of every =.
I don't know if $s and $arr are the same incoming array and it is a typo while posting, or if they are separate incoming arrays. Either way, the variable names should be more informative. If your script is always accessing the quote subarray, then you might like to declare $quote = $array['quote']; early in your script to allow for the simpler use of $quote. This isn't a big deal, just something to consider.
Changing your current working directory to the new directory will spare you needing to add the variable to the glob() parameter AND it will shorten the strings that are being filtered, meaning less work for php.
You can put glob()'s excellent filtering feature to good use and avoid calling your static method entirely.
Finally, as I said in your SO question, you should try to consolidate and minimize total file writes if possible.
Here's some untested code to reflect my advice:
// this can be declared with your other class constants (array declaration available from php5.6+):
const EXCHANGE_CODES = [
"nasdaq" => "nasdaq-us",
"nyse" => "nyse-us",
"new york" => "nyse-us",
"cboe" => "cboe-us"
];
// initialize assoc array of counts prior to your loop
$exchange_counts = array_fill_keys(self::EXCHANGE_CODES + ["others"], 0);
/* makes:
* array (
* 'nasdaq-us' => 0,
* 'nyse-us' => 0,
* 'cboe-us' => 0,
* 'others' => 0,
* )
*/
// start actual processing
$company_slug = strtolower(UpdateStocks::slugCompany($s["quote"]["companyName"]));
$exchange_market = strtolower($arr["quote"]["primaryExchange"]);
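// look up market_code using the exchange name truncated at the first space found at or after offset 4
$space_position = strlen($exchange_market) > 4 ? strpos($exchange_market, ' ', 4) : false;
// if there is no such space (e.g. just "nasdaq"), use the whole exchange name as the lookup key
$leading_text = $space_position === false ? $exchange_market : substr($exchange_market, 0, $space_position);
$market_code = self::EXCHANGE_CODES[$leading_text] ?? 'others'; // null coalescing operator (php7+)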
++$exchange_counts[$market_code];
$sector_slug = str_replace(' ', '-', strtolower($s["quote"]["sector"]));
$random_string = UpdateStocks::getEnc($symb, $symb, $symb, self::START_POINT_URL_ENCRYPTION_APPEND, self::LENGTH_URL_ENCRYPTION_APPEND);
$dir = __DIR__ . self::DIR_FRONT_SYMBOLS_MD_FILES; // symbols front directory
$equity_symbol = strtolower($equity_symbol);
$slug_start = $company_slug === $equity_symbol ? $company_slug : $equity_symbol;
if (!is_dir($dir)) {
mkdir($dir, 0755, true); // creates price targets directory if not exist (recursively)
} else {
chdir($dir); // change current working directory
$preexisting_files = glob("{$slug_start}-*"); // separate static method call is avoided entirely (not sure why you are reversing)
// if you want to eradicate near duplicate files, okay, but tread carefully -- it's permanent.
}
$new_slug = $slug_start . '-' . $sector_slug . '-' . $market_code . '-' . $random_string;
$new_md_filename = preg_replace('/-{2,}/', '-', $dir . self::SLASH . $new_slug . self::EXTENSION_MD);
if (empty($preexisting_files)) {
// I don't advise the iterated opening,writing an empty file,closing 10,000x
}
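For a version that can be run in isolation, the sketch below applies the lookup-array idea to a few made-up primaryExchange values. The marketCode() helper, the sample strings, and the guard for names without a space are assumptions added for illustration; only the EXCHANGE_CODES table and the offset-4 truncation come from the answer.

```php
<?php
// Lookup table copied from the answer's EXCHANGE_CODES constant.
const EXCHANGE_CODES = [
    'nasdaq'   => 'nasdaq-us',
    'nyse'     => 'nyse-us',
    'new york' => 'nyse-us',
    'cboe'     => 'cboe-us',
];

// Hypothetical helper: map a raw exchange name to a market code.
function marketCode(string $exchange): string
{
    $exchange = strtolower($exchange);
    // First space at or after offset 4; skip the search for very short names.
    $space = strlen($exchange) > 4 ? strpos($exchange, ' ', 4) : false;
    // Without such a space, fall back to the whole name as the lookup key.
    $leading = $space === false ? $exchange : substr($exchange, 0, $space);
    return EXCHANGE_CODES[$leading] ?? 'others';
}

// Made-up sample values standing in for $arr["quote"]["primaryExchange"].
$samples = [
    'Nasdaq Global Select',
    'New York Stock Exchange',
    'NYSE Arca',
    'Cboe BZX',
    'OTC Markets',
];

// One counter per distinct market code, plus the catch-all bucket.
$exchange_counts = array_fill_keys(
    array_merge(array_values(EXCHANGE_CODES), ['others']),
    0
);

foreach ($samples as $sample) {
    ++$exchange_counts[marketCode($sample)];
}

print_r($exchange_counts);
```

Adding a new exchange then means adding one line to the table rather than another if/elseif branch.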
$endgroup$
1
$begingroup$
Just occurred to me that you will need to reset the current working directory to the script's directory after each iteration so that chdir() works on the subsequent iterations. (Or you can ignore the chdir() advice and just pad the strings with $dir as you perform your glob() and unlink() processes.)
$endgroup$
– mickmackusa
Mar 18 at 1:58
$begingroup$
On second thought, $dir probably takes care of that concern.
$endgroup$
– mickmackusa
Mar 18 at 5:29
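The two options discussed in these comments can be compared side by side. This is only a sketch under stated assumptions: the temporary directory, the file names, and the 'aapl' prefix are invented for the demo, and the clean-up at the end exists only so the example leaves nothing behind.

```php
<?php
// Build a throwaway directory with made-up file names, only for illustration.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'glob-demo-' . uniqid();
mkdir($dir, 0755, true);
$names = ['aapl-tech-nasdaq-us-x1.md', 'aapl-tech-nasdaq-us-x2.md', 'msft-tech-nasdaq-us-y1.md'];
foreach ($names as $name) {
    file_put_contents($dir . DIRECTORY_SEPARATOR . $name, '');
}

$slug_start = 'aapl';

// Option 1: prefix the pattern with the directory; the results include the directory.
$with_dir = glob($dir . DIRECTORY_SEPARATOR . $slug_start . '-*');

// Option 2: chdir() into the directory, glob relative names, then restore the old cwd
// so that later relative paths still resolve as before.
$previous_cwd = getcwd();
chdir($dir);
$relative = glob($slug_start . '-*');
chdir($previous_cwd);

print_r($relative);

// Clean up the demo files and directory.
array_map('unlink', glob($dir . DIRECTORY_SEPARATOR . '*'));
rmdir($dir);
```

Option 2 yields shorter strings, but any later rename() or unlink() on those names has to run while the working directory is still $dir, or be prefixed with $dir again.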
answered Mar 18 at 0:36
mickmackusa
1
$begingroup$
You could do substr($hey, 0, 2) once and then check the first two letters instead of making multiple strpos calls. Maybe, but I have no idea what $hey looks like :) - but then you could switch on that and get rid of the multiple function calls. I don't think it will be much faster, but with enough iterations, who knows, and it may look a bit cleaner.
$endgroup$
– ArtisticPhoenix
Mar 16 at 19:27
1
$begingroup$
@Emma I have advice to give, but I don't know what $symb is / is doing.
$endgroup$
– mickmackusa
Mar 17 at 23:44
1
$begingroup$
Do not change your posted question here. Too late now. You'll suffer some wrath if you do. I've got some good stuff to show you on this one. (I already spent a fair amount of time on this one this morning.) What is $symb? (If you "ping" me on pages where I have not been active, I will not be alerted.)
$endgroup$
– mickmackusa
Mar 18 at 0:01